NOTE ON THE SPREAD OF A STATE IN SMALL
SOCIAL GROUPS


GEORG KARLSSON * 
COMMITTEE ON MATHEMATICAL BIOLOGY 
THE UNIVERSITY OF CHICAGO 


A model is developed for the spread of a state in small social groups. Under suitable assump- 
tions the model exhibits formal identity with Markov chain theory. The basic theorems and 
classifications of Markov chain theory are stated and interpreted in terms of the model. 
Finally, some procedures for testing the model are indicated. 


Hitherto the theoretical studies of spread of states in social groups 
such as involved in imitation have been confined largely to very large 
groups and have led to nonlinear differential equations (Landahl, 1950;
Landau, 1950; Rashevsky, 1949, 1950). E. Trucco’s model (1954) is
similar to those mentioned but can also be applied to small groups. To- 
tally different is Rapoport’s theory of rumor spread (1953). The purpose 
of this note is to suggest a method applicable to small groups and to de- 
rive some theorems which can be verified in principle by proper observa- 
tions or experiments. 

A mathematical model is suggested in this paper which shows how the 
behavior of one individual at one point of time depends upon the behavior 
of the other individuals he contacted at the preceding point of time. 
(Behavioral events are assumed to take place simultaneously for all indi- 
viduals with finite time intervals between successive events.) The present 
model also describes the dependence on one’s own behavior at the preced- 
ing point of time. Further characteristics of the model are: 1) It takes into 
account all the persons contacted. In the sequel these persons are called 
the members of the group. 2) It allows different persons to have different 
degrees of influence. 3) It takes a large number of behaviors into account. 

Let the number of group members be n. Because of computational
difficulties n has to be a fairly small integer. Let the number of possible
behaviors be m. A point of time will be a period of time, a month, a day,
an hour, depending on the kind of behavior the model is applied to. Let
$b_{ij}^{(t)}$ be the proportion of the time period t spent in behavior j by
individual i; $\sum_j b_{ij}^{(t)} = 1$, of course. Thus the vector
$(b_{i1}^{(t)}, \ldots, b_{im}^{(t)})$ completely specifies the behavior of
person i during period t (or time t). The initial state of the whole group
is given by the matrix

$$
(b_{ij}^{(0)}) =
\begin{pmatrix}
b_{11}^{(0)} & b_{12}^{(0)} & \cdots & b_{1m}^{(0)} \\
b_{21}^{(0)} & b_{22}^{(0)} & \cdots & b_{2m}^{(0)} \\
\vdots & \vdots & & \vdots \\
b_{n1}^{(0)} & b_{n2}^{(0)} & \cdots & b_{nm}^{(0)}
\end{pmatrix}.
$$


Two main assumptions are made in this model which make it possible 
to work out the model, but at the same time restrict its applicability 
considerably. 

I. The behavior of each individual is influenced only by the behavior 
of the group members, himself included, at the time preceding the one at 
which the behavior takes place. 

II. The influence of an individual on another individual is constant 
during the time the model is applied. 

Under these circumstances the influence pattern of the group can be
described by an influence matrix $(a_{ki})$. Each element $a_{ki}$ denotes the
influence by individual i on individual k. The influences are normalized
so that

$$\sum_i a_{ki} = 1, \qquad a_{ki} \ge 0.$$

Given the initial state of the group $(b_{ij}^{(0)})$ and the influence matrix
$(a_{ki})$, our basic assumption is

$$(b_{kj}^{(1)}) = (a_{ki})\,(b_{ij}^{(0)}). \tag{1}$$

This means that the proportion of time spent by individual k in activity
j at time 1 is a weighted mean of the proportions spent in activity j by
all group members at time 0, the weights being the influence coefficients
of the various group members on person k.

The next state at time t = 2 will then be given by

$$(b_{kj}^{(2)}) = (a_{ki})\,(b_{ij}^{(1)}) = (a_{ki})^2\,(b_{ij}^{(0)}). \tag{2}$$

And, in general,

$$(b_{kj}^{(t)}) = (a_{ki})^t\,(b_{ij}^{(0)}). \tag{3}$$


This model does not deal with probabilities. It is completely deterministic.
However, the influence matrix is a stochastic matrix through the
normalization. Thus we can investigate the behavior of $(a_{ki})^t$ and
consequently the behavior of the whole model by the use of Markov chain
theory.
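
The recursion (1)-(3) is easy to try out numerically. The following is a minimal sketch in Python, assuming a hypothetical three-person group with two behaviors; the matrices `A` and `B0` and the helper names `mat_mul` and `spread` are illustrative choices and do not come from the original note.

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def spread(A, B0, steps):
    """Apply the influence matrix A to the state matrix B0 `steps` times."""
    B = B0
    for _ in range(steps):
        B = mat_mul(A, B)
    return B


# Hypothetical influence matrix: row k holds the influences a_ki on person k.
# Every row is non-negative and sums to 1, as the normalization requires.
A = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.1, 0.8]]

# Hypothetical initial state: row i holds the proportions of time that
# person i spends in each of the two behaviors.
B0 = [[1.0, 0.0],
      [0.5, 0.5],
      [0.0, 1.0]]

B1 = spread(A, B0, 1)     # state at time 1, equation (1)
B50 = spread(A, B0, 50)   # state after many steps, equation (3)

for row in B50:
    print([round(value, 6) for value in row])
```

With these illustrative numbers every row of `B50` is close to (0.375, 0.625): all three persons end up dividing their time in the same way, which is the kind of limiting behavior the Markov chain theorems quoted in this note describe for an aperiodic, irreducible group.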

It should be noted that to the transition probability $p_{ki}$ from k to i
corresponds in our model the direct influence $a_{ki}$ from i to k. This
reversal of the process should be kept in mind, particularly when referring
to indirect influence in t steps corresponding to the probability of getting
from one state to another in t steps. Thus $\sum_j p_{kj} p_{ji}$ is the
probability of getting from state k to state i in two steps, and
$\sum_j a_{kj} a_{ji}$ is the indirect influence from person i to person k
in two steps.
Theorem 1. No new behaviors can appear in the group as t increases.
Proof: obvious from the fact that the $b_{kj}^{(t)}$'s are probability mixtures of
the $b_{ij}^{(0)}$'s.
Theorem 2. $\max_i b_{ij}^{(0)} \ge b_{kj}^{(t)}$ for all j and all t.
Proof: $(b_{kj}^{(t)}) = (a_{ki})^t (b_{ij}^{(0)})$. If $(a_{ki})$ is a stochastic matrix so is $(a_{ki})^t$.
Let the elements of $(a_{ki})^t$ be $a_{ki}^{(t)}$ ($a_{ki}^{(t)} \ge 0$, $\sum_i a_{ki}^{(t)} = 1$). Then

$$b_{kj}^{(t)} = \sum_i a_{ki}^{(t)} b_{ij}^{(0)} \le \max_i b_{ij}^{(0)} \qquad (j = 1, 2, \ldots, m).$$

In analogy with the classification of states in the Markov theory we can
carry through a classification of persons.

1) Recurrent or transient: A person i is called a person with transient
behavior if his influence tends to disappear as the interaction process goes
on, i.e., $a_{ki}^{(t)} \to 0$ for all k as $t \to \infty$. Otherwise a person is called a person
with recurrent behavior.

2) Periodic or nonperiodic: A person is called a person with periodic 
behavior if he does not influence himself except after s, 2s, 3s, ... steps 
and through at least one other person. A person with nonperiodic, recurrent 
behavior is called a person with ergodic behavior. Persons with periodic
behavior would be highly exceptional, since they are not directly influenced
by their own previous behavior at all, i.e., $a_{ii} = 0$.

We also get an analogous classification of groups. A set C of persons in
the group is called closed if there is no positive influence from any person
outside the set on any person in the set, i.e., $a_{ki} = 0$ whenever k is contained
in C, but i is not. A group is called irreducible if there are no closed sets other
than the set of all persons in the group. Otherwise it is called decomposable.

Theorem 3. In an aperiodic, irreducible group there exists a unique
limiting distribution of behavior as t increases; $(b_{kj}^{(t+1)}) - (b_{kj}^{(t)}) \to 0$ as $t \to \infty$.

Proof: The theorem follows from Corollary II, p. 329 of W. Feller
(1950) plus the fact that n, the number of persons in the group, is finite.

Theorem 4. In an irreducible group, all members of which exhibit
ergodic behavior, the limit

$$\lim_{t \to \infty} a_{ki}^{(t)} = u_i > 0 \tag{4}$$

exists for every pair k, i and is independent of k. Moreover, $\sum_i u_i = 1$,
and the $u_i$'s satisfy the system of linear equations

$$u_i = \sum_r u_r a_{ri} \qquad (i = 1, \ldots, n). \tag{5}$$

Proof: This is just a formulation in our language of part of Theorem 2,
p. 325 in Feller (loc. cit.).
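
Equation (5) says that the limiting influences form a left fixed vector of the influence matrix. A minimal sketch, assuming a hypothetical three-person influence matrix and using plain repeated multiplication u <- uA (a choice made here for brevity, not a method taken from Feller); the function names are illustrative:

```python
def left_multiply(u, A):
    """Return the row vector u A."""
    n = len(A)
    return [sum(u[r] * A[r][i] for r in range(n)) for i in range(n)]


def limiting_influence(A, tol=1e-13, max_iter=10_000):
    """Iterate u <- u A from the uniform vector until it stops changing."""
    n = len(A)
    u = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = left_multiply(u, A)
        if max(abs(a - b) for a, b in zip(u, nxt)) < tol:
            return nxt
        u = nxt
    raise RuntimeError("no convergence; the group may be periodic")


# Hypothetical irreducible, aperiodic group (rows sum to 1).
A = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.1, 0.8]]

u = limiting_influence(A)
print([round(value, 6) for value in u])
```

For this matrix the iteration settles at u = (0.25, 0.25, 0.5): the third person, who gives the most weight to his own previous behavior, carries half of the long-run influence.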

Theorem 5. If there is a closed set C in the group, the final influences
by C on the different persons k are given by the solutions of the system of
linear equations

$$x_k - \sum_T a_{k\nu} x_\nu = x_k^{(1)}, \tag{6}$$

where T indicates that the sum is over all persons with transient behavior
and

$$x_k^{(1)} = \sum_C a_{ki},$$

with the sum over all i in C.

Proof: A proof analogous to Feller’s proof of the corresponding theorem
in Markov theory is written out because of the somewhat different inter-
pretation in the present model.

Let $x_k^{(t)}$ be the influence on k from C in exactly t steps. Then

$$x_k = \sum_{t=1}^{\infty} x_k^{(t)} \tag{7}$$

is the total influence on k from C;

$$x_k^{(1)} = \sum_C a_{ki}. \tag{8}$$

The influence on k from C in exactly t + 1 steps must come via a person
with transient behavior after t steps. Thus

$$x_k^{(t+1)} = \sum_T a_{k\nu} x_\nu^{(t)}. \tag{9}$$

Equations (8) and (9) are recurrence relations uniquely determining $x_k^{(t)}$.
Adding (9) for t = 1, 2, 3, ..., we get the result stated in the theorem.

If the group contains only one closed set, consisting of one person, then
in the long run everyone in the group will divide his activities into exactly 
the same time proportions as this one individual in the closed set. If there 
is just one closed set but with two or more members, the set will reach an 
asymptotic behavior distribution according to Theorem 2, and this dis- 
tribution will be imposed on all the transient members of the whole group. 
If there is more than one closed set in the group, the final behavior dis- 
tribution of the transient group members will depend on the strength of 
the influences from the members in the respective closed sets. 
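
The recurrence used in the proof of Theorem 5 can be followed step by step on a small example. The sketch below assumes a hypothetical four-person group with two one-person closed sets and two persons with transient behavior; the matrix and the function name `total_influence` are illustrative only.

```python
# Hypothetical influence matrix; row k holds the influences on person k.
A = [[1.0, 0.0, 0.0, 0.0],   # person 0: closed set C (influenced only by himself)
     [0.0, 1.0, 0.0, 0.0],   # person 1: a second closed set
     [0.3, 0.1, 0.4, 0.2],   # person 2: transient behavior
     [0.1, 0.3, 0.2, 0.4]]   # person 3: transient behavior

C = [0]       # the closed set whose influence is being traced
T = [2, 3]    # the persons with transient behavior


def total_influence(A, C, T, steps=200):
    """Sum the influence from C on each transient person over `steps` steps."""
    # One-step influence: x_k^(1) = sum over i in C of a_ki.
    x_t = {k: sum(A[k][i] for i in C) for k in T}
    total = dict(x_t)
    for _ in range(steps - 1):
        # Influence in exactly t + 1 steps arrives via a transient person.
        x_t = {k: sum(A[k][v] * x_t[v] for v in T) for k in T}
        for k in T:
            total[k] += x_t[k]   # partial sum of the series for x_k
    return total


x = total_influence(A, C, T)
print({k: round(value, 6) for k, value in x.items()})
```

With these numbers persons 2 and 3 end up receiving 0.625 and 0.375 of their influence from C, and the remainder from the other closed set, so the final behavior of each transient person is the corresponding mixture of the behaviors of persons 0 and 1.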

Because of the restrictions put on this model by the strong assumptions 
made, it is not to be expected that it will fit any actual situation very well. 
However, the data necessary for a test of the model should not be difficult 
to collect. All that has to be observed are the proportions of time spent 
by the various group members in the different behavior classes to which 
the model is applied. 

Methods for testing the fit to the data are also available. An estimate
of the influence matrix $(a_{ki})$ can be made by the least squares methods as
developed by G. A. Miller (1952). Attempts could also be made to esti-
mate the $a_{ki}$'s directly. For various tentative methods of measuring influ-
ence, see R. Lippitt, N. Polansky, F. Redl, and S. Rosen (1952).


The author is very much obliged to Dr. H. G. Landau for a number of 
discussions of problems relating to this note. 
NOTE ON THE THEORY OF MASS BEHAVIOR


JOHN Z. HEARON
OAK RIDGE, TENNESSEE


A differential equation has been derived by A. Rapoport, Bull. Math. Biophysics, 14, 159 
(1952), giving the time course of the fraction of the population who have performed a given 
act. The general solution of this equation is obtained, some properties of the solution are de- 
duced, and a special case presented in detail. 


Introduction. In a paper on the theory of the propagation of a single
act in a large population, A. Rapoport (1952) derived the following equa-
tion

$$\frac{dF}{dt} = [x(t) + bF]\,[1 - F]. \tag{1}$$

Here F is the fraction of the population who have conformed, x(t) is the
external influence or stimulus, and bF represents the imitation component.
In (1) it is supposed that b > 0 is constant and that the threshold, for
external stimuli and for the social component, is unity (see Rapoport, loc.
cit. for discussion).


For the Cases* (i) b = 0, $x(t) \ne 0$, the imitation-free case; (ii) $x(t) = \text{constant} \ne 0$,
$b \ne 0$; and (iii) x(t) = 0, $b \ne 0$, the pure imitation case, (1) can be solved either as a linear
equation [Case (i)] or by separation of variables [Cases (ii) and (iii)]. In addition to these
cases Rapoport considered Case (iv) x(t) = at, $b \ne 0$, and compared this to the corresponding
imitation-free case by comparing the coefficients in the series solution to those in the series
expansion of the solution for the corresponding Case (i) situation.


An interesting feature which can be discussed in terms of (1), and which 
was pointed out by Rapoport, is that of the modification imposed by the 
presence of the social imitation component upon the imitation-free curve. 
This point appears, in fact, to be amenable to experimental check in ways 
discussed by Rapoport. In order to check the theory underlying (1) against 
experimental data or existing statistical information it will be necessary to 
have available a solution of (1), either analytical or numerical, in which 
x(t) is some prescribed function and $b \ne 0$, and to compare this solution
to the limiting case $b \to 0$.


* The enumeration of cases in this paper does not correspond to that of Rapoport (loc. cit.).

It is the purpose of this note to point out that by recognizing (1) as the
Riccati equation the general solution can be formally indicated in terms
of quadratures. In particular, the Case (iv) solution is presented in terms
of elementary functions and a tabulated function, the normalized proba-
bility integral.

The Riccati equation. If (1) is written in the form

$$\frac{dF}{dt} = A_0(t) + A_1(t)\,F + A_2(t)\,F^2, \tag{2}$$

where

$$A_0(t) = x(t), \qquad A_1(t) = b - x(t), \qquad A_2(t) = -b, \tag{3}$$

it is recognized at once as the generalized Riccati equation (e.g., Rainville, 1943). The proper-
ties of this equation are well known. For our purposes it is enough to have the following
property: If a particular solution, say $F_1(t)$, is known then the substitution

$$F(t) = F_1(t) + \frac{1}{V(t)} \tag{4}$$

leads to a linear first-order equation in V(t),

$$V' + [A_1(t) + 2F_1(t)\,A_2(t)]\,V = -A_2(t), \tag{5}$$

which can be solved by two quadratures.
The general solution. From inspection of (1), or its equivalent (2), it is
evident that $F_1(t) = 1$ is a particular solution. The substitution (4) then
leads to

$$V' - [b + x(t)]\,V = b. \tag{6}$$

The solution of (6) is

$$V(t) = e^{U(t)}\,[\psi(t) - K], \tag{7}$$

where K is an arbitrary constant and

$$U(t) = bt + \int_0^t x(\xi)\,d\xi = bt + g(t), \tag{8}$$

$$\psi(t) = b \int_0^t e^{-U(\xi)}\,d\xi. \tag{9}$$

From the initial condition, (4), and (7) it follows that

$$V(0) = -K = -\frac{1}{1 - F(0)}, \tag{10}$$

and from (4), (7), and (10),

$$F(t) = 1 - \left[\frac{e^{-U(t)}}{K - \psi(t)}\right]. \tag{11}$$

Equation (11), with the definitions (8) and (9), represents the general
solution of (2). The actual treatment of (11) will depend, in any given
case, upon whether or not the integrals (8) and (9) can be obtained in
closed form.
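
When the integrals are not available in closed form, the general solution can still be evaluated by numerical quadrature and checked against a direct integration of (1). The sketch below is an illustration only: it takes the solution in the form F = 1 - exp(-U)/(K - psi) with K = 1/(1 - F(0)), and the stimulus x(t) = 0.5t, the values b = 1 and F(0) = 0.1, the trapezoidal rule, and all function names are assumptions made for the example.

```python
import math


def closed_form(x, b, F0, t_end, n=4000):
    """Evaluate F(t_end) = 1 - exp(-U) / (K - psi) by trapezoidal quadrature."""
    h = t_end / n
    K = 1.0 / (1.0 - F0)            # constant fixed by the initial condition
    g = 0.0                         # running value of g(t), the integral of x
    psi = 0.0                       # running value of b * integral of exp(-U)
    prev_x, prev_e = x(0.0), 1.0    # x(0) and exp(-U(0))
    for i in range(1, n + 1):
        t = i * h
        cur_x = x(t)
        g += 0.5 * h * (prev_x + cur_x)          # trapezoidal rule for g
        cur_e = math.exp(-(b * t + g))           # exp(-U(t)) with U = b t + g
        psi += 0.5 * h * b * (prev_e + cur_e)    # trapezoidal rule for psi
        prev_x, prev_e = cur_x, cur_e
    return 1.0 - prev_e / (K - psi)


def rk4(x, b, F0, t_end, n=4000):
    """Integrate dF/dt = (x(t) + b F)(1 - F) with the classical Runge-Kutta scheme."""
    def rhs(t, F):
        return (x(t) + b * F) * (1.0 - F)

    h = t_end / n
    F = F0
    for i in range(n):
        t = i * h
        k1 = rhs(t, F)
        k2 = rhs(t + h / 2, F + h * k1 / 2)
        k3 = rhs(t + h / 2, F + h * k2 / 2)
        k4 = rhs(t + h, F + h * k3)
        F += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return F


def stimulus(t):
    """Hypothetical external influence x(t) = 0.5 t."""
    return 0.5 * t


F_formula = closed_form(stimulus, b=1.0, F0=0.1, t_end=2.0)
F_direct = rk4(stimulus, b=1.0, F0=0.1, t_end=2.0)
print(F_formula, F_direct)
```

The two values should agree to within the quadrature error, which gives a simple way to guard against sign or bookkeeping mistakes when (11) is applied to a new x(t).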

Some properties of the general solution. There are certain properties of 
F(t) which can be deduced simply from (11) and others which can be de- 
duced directly from the differential equation (1). 

Some of these are intuitively obvious and would appear necessarily true 
from the context of Rapoport’s theory. Still it seems desirable to present 
the formal proofs here and to show that the general solution, which has 
not been previously exhibited, actually possesses those properties demand- 
ed by common-sense considerations of the model on which it was based. 

From the way in which (1) was derived, that equation and its solution,
(11), make no sense unless

$$x(t) \ge 0. \tag{12}$$

Further, from the interpretation of (1), it is necessary that

$$0 \le F(0) \le 1. \tag{13}$$

We here exclude the trivial case F(0) = 1, in which F(t) = 1 for all t. In
what follows the function (11) is denoted by F(t) when it is understood
that $b \ne 0$ and when b = 0 it is denoted by $F_0(t)$. It is of course under-
stood that $F(0) = F_0(0)$ and that t is reckoned as positive. We wish to
show that (12) and (13) imply the following properties:
I. For all t and any b, $0 \le F(t) \le 1$.
II. For any b > 0, $F(\infty) = 1$; but $F_0(\infty) = 1$ only if the integral $g(\infty)$
does not converge.
III. For all t, $F_0(t) \le F(t)$. The equality sign holds at $t = \infty$ if the inte-
gral $g(\infty)$ diverges and it obviously holds at t = 0. Further,
$\partial F/\partial b \ge 0$ for all t and b.
IV. For all t, $F'(t) \ge 0$. If F(0) = 0, then F'(0) = 0 and $F''(0) \ne 0$,
provided that x(0) = 0 and $x'(0) \ne 0$.
V. Up to second-order terms in t, $F(t) = F_0(t) \ne 0$ if, and only if,
F(0) = 0, x(0) = 0, and* $x'(0) \ne 0$.

* Note, in connection with IV and V, that in view of the restriction (12), if x(0) = 0 then
$x'(0) \ge 0$. The assertion $x'(0) \ne 0$ can then be replaced everywhere by x'(0) > 0.

In order to prove I, it must be shown that the quantity in brackets in
(11) is non-negative and equal to or less than unity. That it is non-nega-
tive will follow if it can be shown that

$$K \ge \psi(t). \tag{14}$$

From (13) and (10), $K \ge 1$ and by a well-known theorem $\psi \le M(1 - e^{-bt})$
where M is the maximum value of $e^{-g(\xi)}$ on the interval $0 \le \xi \le t$.
From (12) and the definition of g(t), M = 1. Therefore $\psi(t) \le 1$ for all t.
The assertion (14) follows at once. The quantity in brackets in (11) is
equal to or less than unity if

$$e^{-bt}\,e^{-g(t)} \le K - \psi(t), \tag{15}$$

where the definition (8) has been used in writing (15). It will be shown that

$$e^{-bt} \le K - \psi(t). \tag{16}$$

If (16) is valid it is clear, from (12), that (15) holds. Let (16) be writ-
ten as

$$K - \psi(t) \ge 1 - b \int_0^t e^{-b\xi}\,d\xi. \tag{17}$$

Now it has been seen that $K \ge 1$, and also that

$$-\psi(t) \ge -b \int_0^t e^{-b\xi}\,d\xi. \tag{18}$$

This proves (17) and hence completes the proof of I.

From (11) and (12) it is evident that $F_0(\infty) < 1$ if $g(\infty)$ is finite,
but $F_0(\infty) = 1$ if $g(\infty)$ diverges. For b > 0, the numerator of the
quantity in brackets in (11) approaches zero as $t \to \infty$, and (14)
shows that the integral $\psi(\infty)$ converges. Therefore, $F(\infty) = 1$, provided
that $\psi(\infty)$ does not converge to the value K. The only case in which $\psi(\infty)$
does converge to the value K is trivial. For $\psi(\infty)$ has at most the value
unity, actually obtaining this value only if $x(t) \equiv 0$, and K has the value
unity only if F(0) = 0. Plainly, from (11), the conditions $x(t) \equiv 0$ and
F(0) = 0 yield the trivial result F(t) = 0 for all t. Rapoport (loc. cit.) has
already noted that in the imitation-free case the zero initial condition
leads to this trivial situation. This completes the proof of II.

The validity of the first statement in III will follow if it is shown that

$$K e^{-bt} \le K - \psi(t). \tag{19}$$

But (19) can be written as

$$K \left[ b \int_0^t e^{-b\xi}\,d\xi \right] \ge b \int_0^t e^{-b\xi}\,e^{-g(\xi)}\,d\xi, \tag{20}$$

which reduces to

$$K \ge e^{-g(t^*)}, \tag{21}$$

where $t^*$ lies on the interval $0 \le t^* \le t$. From (12) and the definition of
g(t), the right-hand side of (21) is at most unity and it has been seen that
K is at least unity. Therefore (21) and hence (19) holds. The equality in
(19) holds when t = 0, and also if $x(t) \equiv 0$ and F(0) = 0 (which has been
discussed as trivial). Evidently, from II, the equality sign in the first
statement of III holds at $t = \infty$ if $g(\infty)$ diverges. The proof of the last
statement in III proceeds as follows: From (4), with $F_1(t) = 1$,

$$\frac{\partial F}{\partial b} = -\frac{1}{V^2}\,\frac{\partial V}{\partial b}. \tag{22}$$

It must be shown, therefore, that

$$\frac{\partial V}{\partial b} \le 0. \tag{23}$$

From (6)

$$\frac{\partial}{\partial t}\,\frac{\partial V}{\partial b} - [x + b]\,\frac{\partial V}{\partial b} = 1 + V. \tag{24}$$

Now if $\partial V/\partial b$ satisfies (24), it obviously satisfies

$$\frac{\partial V}{\partial b} = e^{U(t)} \int_0^t [1 + V(\xi)]\,e^{-U(\xi)}\,d\xi, \tag{25}$$

where the fact that $\partial V/\partial b = 0$ at t = 0, readily verified from (7), has
been used. From (25), (23) follows if

$$1 + V(t) \le 0. \tag{26}$$

But (26) is clearly true since, from (4),

$$V(t)\,F(t) = 1 + V(t).$$

By I, F(t) is non-negative and by (7), (10), and (14), V(t) is negative.
This completes the proof of III.


From (1), with (12) and I, it follows that $F'(t) \ge 0$ which is the first
statement in IV. From (1) and the condition F(0) = 0, it follows that

$$F'(0) = x(0). \tag{27}$$

From differentiation of (1),

$$F''(t) = [x'(t) + bF'(t)]\,[1 - F(t)] - F'(t)\,[x(t) + bF(t)], \tag{28}$$

and, if F(0) = 0,

$$F''(0) = x'(0) + b\,x(0) - x^2(0). \tag{29}$$

From (27) and (29) the second statement of IV follows.
The expansion of F(t) about the point t = 0 is given, using (27), (29),
and F(0) = 0, by

$$F(t) = x(0)\,t + [x'(0) + b\,x(0) - x^2(0)]\,\frac{t^2}{2} + \cdots. \tag{30}$$

The truth of V is evident from (30).




Discussion of I-V. As noted, the property in I is intuitively obvious
from the interpretation and mode of derivation of (1). The formal proof
has been exhibited and shows the manner in which this property depends
upon (12) and (13). The property II shows that for any x(t), subject only
to the restriction (12), the limiting value of F(t) is unity. The sharpest
contrast between the cases b = 0 and $b \ne 0$ will be obtained when $g(\infty)$
converges. The first statement in III shows that the curve F(t) lies above
the imitation-free curve, $F_0(t)$, everywhere except at t = 0 and possibly
at $t = \infty$. It is not obvious from this that $F(b_2, t) \ge F(b_1, t)$ if $b_2 > b_1 > 0$.
But the last statement in III does insure this property. It is easily shown
by direct differentiation of (11) that $\partial F/\partial b = 0$ at t = 0 for any b, but
that $\partial F/\partial b = 0$ at $t = \infty$ only if b > 0 and, in fact, that $\partial F/\partial b$ in-
creases without limit when b = 0 and $t \to \infty$.

Statements IV and V are merely examples of the kinds of properties
which can be deduced directly from the differential equation, (1), by in-
voking only general properties of x(t). A special case of V was proved by
Rapoport (loc. cit.) who considered x(t) = at where a is a positive con-
stant. There are however a variety of functions which satisfy x(0) = 0,
$x'(0) \ne 0$. From (29) higher derivatives can be obtained systematically
and it is here conjectured, without proof, that if
$F(0) = x(0) = x^{(1)}(0) = \cdots = x^{(k-1)}(0) = 0$ and $x^{(k)}(0) \ne 0$, then
$F(t) = x^{(k)}(0)\,t^{k+1}/(k+1)!$ + (higher terms).

It is to be emphasized here that the expansion of F(t) about the point
t = 0 can be written directly from (1) by obtaining higher derivatives in
the manner in which (29) was obtained. It is not likely that the compari-
son of coefficients in the expansions of F(t) and $F_0(t)$ will be very informa-
tive. However, by working directly from (1), the expansion of F(t) can be
obtained with minimum commitments as to the nature of x(t). On the
other hand, a frontal attack on the series solution of (1) requires that x(t)
be prescribed. In particular if x(t) is not a polynomial its series represen-
tation must be available.

Case iv: F(0) = 0, x(t) = at, $b \ne 0$. Because this case was considered
in some detail, in terms of the series solution, by Rapoport (loc. cit.) and
because it demonstrates that the integral $\psi(t)$ cannot be obtained in closed
form in some relatively simple cases, we record this case explicitly here.

In view of (8), (9), and (10), (11) now reads

$$F(t) = 1 - \frac{e^{\beta^2}\,e^{-(\alpha t + \beta)^2}}{1 - \sqrt{\pi}\,\beta\,e^{\beta^2}\,\{\phi(\alpha t + \beta) - \phi(\beta)\}}, \tag{31}$$

where $\alpha = \sqrt{a/2}$, $\beta = b/\sqrt{2a}$, and

$$\phi(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-\xi^2}\,d\xi$$

is the normalized probability integral. From (31), F(t) may be computed
using tabulated values of $\phi$ and its derivative.
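
In place of tables, the Case (iv) formula can be evaluated with a library error function. The sketch below assumes that the normalized probability integral coincides with the error function `math.erf` of the Python standard library, i.e. (2/sqrt(pi)) times the integral of exp(-s^2) from 0 to z; the parameter values a = 1, b = 1 and the function names are illustrative.

```python
import math


def F_case_iv(t, a, b):
    """Case (iv): F(0) = 0, x(t) = a t, with phi taken to be math.erf."""
    alpha = math.sqrt(a / 2.0)
    beta = b / math.sqrt(2.0 * a)
    numerator = math.exp(beta ** 2 - (alpha * t + beta) ** 2)
    denominator = 1.0 - math.sqrt(math.pi) * beta * math.exp(beta ** 2) * (
        math.erf(alpha * t + beta) - math.erf(beta))
    return 1.0 - numerator / denominator


def F_imitation_free(t, a):
    """The b = 0 curve for the same stimulus: F_0(t) = 1 - exp(-a t^2 / 2)."""
    return 1.0 - math.exp(-a * t * t / 2.0)


# Tabulate both curves for hypothetical values a = 1, b = 1.
for t in (0.0, 0.5, 1.0, 2.0):
    print(t, F_imitation_free(t, a=1.0), F_case_iv(t, a=1.0, b=1.0))
```

In the printed table the curve with imitation should lie on or above the imitation-free curve at every t, in line with property III.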
Remarks. There are various ways in which (1) may be extended or generalized. The coeffi-
cient, b, in the social imitation component, bF, may be a function of t. If this function is h(t),
then (11) is valid if bt is replaced everywhere by $\int_0^t h(\xi)\,d\xi$ and $\psi(t)$ is taken as

$$\psi(t) = \int_0^t h(\xi)\,e^{-U(\xi)}\,d\xi.$$

Certain properties in I-V may not then follow unless some conditions are placed on h(t), but
the extension is straightforward. A more important generalization results when the threshold
exceeds unity. Then, as Rapoport (loc. cit.) has shown, x(t) is replaced by a more complicated
function and b is replaced by a similar function. The modification of (11) is obvious and, in
fact, F(t) is always given, provided the imitation component is linear in F(t), by

$$F(t) = 1 + \frac{1}{h(t)}\,\frac{d}{dt} \ln [K - \psi(t)], \tag{32}$$

where K is as defined before; h(t) is a constant (viz. b) if b is constant and the threshold is unity
but is otherwise a prescribed function of t;

$$\psi(t) = \int_0^t h(\xi)\,e^{-U(\xi)}\,d\xi$$

with

$$U(t) = \int_0^t h(\xi)\,d\xi + \int_0^t G(\xi)\,d\xi,$$

where G(t) is x(t) if the threshold is unity but is otherwise a prescribed function of x(t).

It is possible that the above form for F(t) will be of use in the fitting problem, which is
certainly difficult and has only been briefly referred to by Rapoport (loc. cit., p. 164). For ex-
ample, if b is constant and known, the determination of x(t) from (11) is not simple. But by
integration of (32), $\psi(t)$ is given by

$$\psi(t) = K\,[1 - e^{-bS(t)}], \tag{33}$$

where S(t) is the area, up to time t, between the asymptote $F(\infty) = 1$ and the observed curve
F(t). In the (more likely) event that x(t) is known but b is constant and unknown, b may be
determined as follows: Differentiation of (33) gives

$$e^{-g(t)} = K\,(1 - F)\,\exp\left[ b \int_0^t F(\xi)\,d\xi \right],$$

in which g(t) is known, F(t) is observed, and the area under the F(t) curve is easily obtained.
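
The last relation can be solved for b once the area under the observed curve has been measured. The sketch below is a hypothetical illustration on synthetic data: the "observations" are generated by integrating (1) with a known b, and the relation exp(-g(t)) = K(1 - F) exp(b * area under F) is then solved for b at the final sample. The stimulus x(t) = 0.5t, the value b = 0.8, and the function names are assumptions made for the example.

```python
import math


def simulate(x, b, F0, t_end, n):
    """Produce synthetic 'observations' of F by Runge-Kutta integration of (1)."""
    def rhs(t, F):
        return (x(t) + b * F) * (1.0 - F)

    h = t_end / n
    ts, Fs = [0.0], [F0]
    for i in range(n):
        t, F = ts[-1], Fs[-1]
        k1 = rhs(t, F)
        k2 = rhs(t + h / 2, F + h * k1 / 2)
        k3 = rhs(t + h / 2, F + h * k2 / 2)
        k4 = rhs(t + h, F + h * k3)
        Fs.append(F + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6)
        ts.append((i + 1) * h)
    return ts, Fs


def estimate_b(ts, Fs, g):
    """Solve exp(-g(t)) = K (1 - F(t)) exp(b * area under F) for b at the last sample."""
    K = 1.0 / (1.0 - Fs[0])
    # Trapezoidal estimate of the area under the observed F curve.
    area = sum(0.5 * (ts[i + 1] - ts[i]) * (Fs[i] + Fs[i + 1])
               for i in range(len(ts) - 1))
    return -(g(ts[-1]) + math.log(K * (1.0 - Fs[-1]))) / area


a = 0.5   # hypothetical stimulus x(t) = a t, so that g(t) = a t^2 / 2
ts, Fs = simulate(lambda t: a * t, b=0.8, F0=0.1, t_end=2.0, n=2000)
b_hat = estimate_b(ts, Fs, g=lambda t: a * t * t / 2.0)
print(b_hat)
```

On noise-free synthetic data the recovered value should be very close to the b used in the simulation; with real observations the estimate would depend on the time t chosen and on the errors in F, which this sketch does not address.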
Some principles of information theory are utilized in the design of neural nets of the
McCulloch-Pitts type. In particular, problems are considered where signals from several neu-
rons must pass through a single one, thus resulting in a “bottleneck” in the flow of informa-
tion, an abstract model of the corresponding bottleneck from the retina to the optic nerve.
The first part of the paper deals with a construction of a McCulloch-Pitts net in which the
redundancy in the messages originating in two neurons is utilized so that the messages can be
sent over a single neuron with little loss of information. In the second part, messages from a set
of neurons are “pumped” into two channel neurons. The optimum connection scheme is
computed for this case, i.e., one resulting in a minimum loss of information. Possible biological
implications of this approach are indicated.


The development in recent years of the mathematical theory of com-
munication, based on the concept of “amount of information,” has raised
the question concerning the applicability of these concepts to the func-
tioning of the nervous system. D. M. MacKay and W. S. McCulloch, for
example (1952), have compared the limiting information capacity of a
neuronal link, operating on the principle of pulse-interval modulation, 
with one operating on the principle of binary modulation and have shown 
that on the basis of assumptions reasonably applicable to neurons, the 
latter link is capable of transmitting information at a considerably greater 
rate. 

In this paper we shall be particularly interested in links of the many-one 
type, that is, where the outputs of several neurons are “pumped” into a 
single one. In general, of course, such a “bottleneck” channel will be 
expected to be characterized by loss of information. We wish, however, to 
examine this problem quantitatively and to study the connection between 
the coding systems operating and the amount of loss incurred. The abstract 
problem stems from the observation of the well-known “bottleneck” in the
visual system (Maximov and Bloom, 1938) and has been approached in
different ways by others (Landahl, 1939; Culbertson, 1948). An attempt
will be made to apply some principles of information theory to the
problem.

In order to examine a portion of the nervous system as a
communication-transmitting device, one needs to know the structure of that portion.
This can be studied to a certain extent by anatomical methods. But the
knowledge of the structure is not sufficient. One needs to know the “coding
system” by means of which stimulation from the outside is transformed
into signals, the signals are transformed within the system and finally de-
coded. One also needs to know something about the nature of the source
from which the stimuli originate. The determination of the nature of the
signals traversing the nervous system is presumably a task of physiology,
but it is not clear how one can go about studying the “nature of the
source” from an information-theoretical point of view. In the kinds of
communication systems studied in information theory one deals with a
well-defined universe from which the messages are selected, say, sequences
of letters occurring in some written language. The relative frequencies of
these messages and their statistical interdependence determine the “en-
tropy” (amount of information) produced by the source (per unit time or
per unit message). This quantity must be known before anything can be
said from the information-theoretical point of view about the functioning
of the communication device, say, the capacity of its channels or the
efficiency of various coding systems. As we shall see, we cannot determine
the entropy of the visual field itself (presumably the source of visual
signals). Our “source” will have to be the totality of messages originating
in the first line elements of the visual apparatus.

Considerable detailed knowledge has been obtained about the structure 
of the visual system of primates, described, for example, by S. L. Polyak 
(1941). The visual system certainly resembles an intricate communica- 
tion system. It has sufficient regularity and consistency (for example, in 
the typical systems of connections characteristic of different types of 
morphologically distinguishable cells) to warrant a hypothesis that in- 
formation is passed along the system and transformed in the process in a 
rather definite manner. That is to say, it is likely that different types of 
cells in the visual system respond to stimuli in ways characteristic for 
them. But this is coding. Possibly by refined physiological methods some 
of this coding can be deciphered. But even the most detailed knowledge 
of this sort still tells us nothing directly about the nature of the “real 
source,” that is, the universe of stimuli impinging on the system from the 
outside. There seems to be an insuperable barrier to obtaining this knowl-
edge, because one is at a loss what to consider a primitive signal and how
to approach the problem of studying the statistical correlation of such
signals. One is forced, therefore, to take a later stage in the course of
events as the “source.” After a light signal has passed through the cornea,
the aqueous humor, the lens, and the vitreous humor (possibly being al-
ready to some extent transformed in the process), it activates, presum-
ably, photoreceptors in the retina. It is the activity of these photo-
receptors which will be considered as our “source” of information, i.e.,
the retina is taken to be the “transmitter” in the communication scheme.

Even though the activity of the retina is capable of being described in 
much more precise terms than the events in the visual field, we still can 
say very little about the nature of this “source.” We don’t know, for ex-
ample, how the activity of one receptor is correlated with that of another.
However, we can at least specify what we mean by a primitive signal. We 
can define it as the state of activation of an individual photoreceptor. If 
we now schematize the photoreceptor layer in the retina as a mosaic, as- 
sume the operation of the all-or-none law, and take for our unit of time 
the shortest refractory period of all the receptors, then within each unit of 
time the state of this layer of the retina can be specified by a particular 
configuration of the activated (firing) receptors. 

If each state is counted as a possible “message,” the number of such
messages is $2^n$, where n is the number of receptors. If the firings of the
photoreceptors were all independent of each other, and if in a certain
interval of time the probability of firing of the ith receptor were $p_i$ and
if we set $q_i = 1 - p_i$, then in that interval the rate of entropy production
in the retina would be (Shannon and Weaver, 1949)

$$H = -\sum_{i=1}^{n} (p_i \log p_i + q_i \log q_i) \ \text{bits/message.*} \tag{1}$$

In particular, if $p_i = 1/2$ for all i, H = n bits per message. If we take the
shortest refractory period of all the receptors to be our unit of time, each
receptor can fire only once or not fire at all in each unit of time. Therefore
we can conceive of the messages (configurations of firing) to be sent at the
rate of one per unit of time. If $n = 10^8$ (Polyak, 1941) and the refractory
period about $10^{-3}$ seconds (Fulton, 1949), the independence of firing at an
average frequency of once per two milliseconds would give us a rate of
production of information in the retina of $10^{11}$ bits per second. This result
is quite fantastic in view of the experimentally obtained orders of magni-
tude for the rates at which organisms are able to respond to information
or store information.

* Throughout this paper logarithms are taken to the base 2.
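
The arithmetic behind this estimate can be reproduced in a few lines. The sketch below evaluates the entropy sum for independent receptors and multiplies by the message rate; the figures of 10^8 receptors and a refractory period of 10^-3 seconds are the order-of-magnitude values used in the text, and the function and variable names are illustrative.

```python
import math


def entropy_per_message(probs):
    """Sum of binary entropies, in bits, for independent firing probabilities."""
    total = 0.0
    for p in probs:
        for r in (p, 1.0 - p):
            if r > 0.0:          # the term r log r is taken as 0 when r = 0
                total -= r * math.log2(r)
    return total


n = 10 ** 8                  # number of photoreceptors (order of magnitude)
refractory_period = 1e-3     # seconds; one message per refractory period

# With identical, independent receptors the sum is n times a single term.
bits_per_message = n * entropy_per_message([0.5])
rate = bits_per_message / refractory_period

print(bits_per_message, rate)
```

Because every receptor with p = 1/2 contributes exactly one bit, the rate is simply the number of receptors divided by the refractory period, which is where the figure of about 10^11 bits per second comes from.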

This picture is considerably modified if we conceive of the retina as 
composed of several regions, which transmit not the precise firing con- 
figurations as messages, but only the amount of activity within them. 
That is, the number of photoreceptors firing in each region constitutes a
“message.” 

To fix ideas, suppose this is true for the entire retina. Only the “bright- 
ness” of the visual field is perceived, not the pattern. We will first treat 
the case where p = 3 for all receptors. The average “brightness” will be 
the expected number of activated neurons. However, the successive mes- 
sages will vary because of the fluctuations from this average. Since the 
probability that at a particular time exactly 7 photoreceptors will fire is 
given by 2-"(7), we have for this case 


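$$H = -\sum_{i=0}^{n} 2^{-n}\binom{n}{i} \log\left[2^{-n}\binom{n}{i}\right]. \tag{2}$$

Using the well-known approximation

$$2^{-n}\binom{n}{i} \approx \frac{2}{\sqrt{2\pi n}}\, e^{-(n-2i)^2/2n} \tag{3}$$

and replacing $i$ by the continuous variable $x$, we rewrite the sum (2) as an
integral

$$H = -\frac{2}{\sqrt{2\pi n}} \int_0^n e^{-(n-2x)^2/2n} \log\left[\frac{2}{\sqrt{2\pi n}}\, e^{-(n-2x)^2/2n}\right] dx$$
$$\quad = -\log 2 + \tfrac{1}{2}\log 2\pi n + \frac{2}{\sqrt{2\pi n}} \int_0^n e^{-(n-2x)^2/2n}\, \frac{(n-2x)^2}{2n}\, \log e \; dx. \tag{4}$$

It remains to evaluate the last integral.*
The substitution

$$y = \frac{(n-2x)^2}{2n} \tag{5}$$

transforms the integral into

$$\frac{\log e}{\sqrt{\pi}} \int_0^{n/2} e^{-y}\, y^{1/2}\, dy, \tag{6}$$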
which, already for moderately large $n$, becomes

$$\frac{1}{\sqrt{\pi}} \log e \; \Gamma\left(\tfrac{3}{2}\right) = \tfrac{1}{2}\log e. \tag{7}$$

Hence

$$H = \log\sqrt{2\pi n e} - 1. \tag{8}$$

* No account is taken here of the actual coding in the visual system, for example, of the
fact that photoreceptors respond with certain frequencies of firing to brightness or, more
frequently, to variations of brightness, of accommodation phenomena, etc. The strict binary
modulation coding is taken to be a model throughout, in order to deduce the implications of
this particular assumption.


The approximation holds to three places for n = 17 and, of course, becomes
progressively better as $n$ becomes large. For very large $n$, we can
safely use the formula

$$H = \tfrac{1}{2}\log n. \tag{9}$$


The generalization for arbitrary $p$ is straightforward. Equation (4) is
replaced by

$$H = -\int_0^n \frac{1}{\sqrt{2\pi npq}}\, e^{-(np-x)^2/2npq} \log\left[\frac{1}{\sqrt{2\pi npq}}\, e^{-(np-x)^2/2npq}\right] dx \tag{10}$$

and leads to

$$H = \log\sqrt{2\pi e\, npq}.$$
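The quality of the closed form can be checked numerically. The sketch below, which is an illustration and not part of the original derivation, computes the exact entropy of the binomial distribution of the number of firings and compares it with $\log\sqrt{2\pi e\, npq}$. The function names are hypothetical, and $0 < p < 1$ is assumed.

```python
import math

def binomial_entropy_bits(n, p):
    """Exact entropy (bits) of the number of firings among n independent receptors."""
    q = 1.0 - p
    h = 0.0
    for i in range(n + 1):
        # Work with the log-probability (via lgamma) so that large binomial
        # coefficients do not overflow.
        log_prob = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log(q))
        prob = math.exp(log_prob)
        if prob > 0.0:
            h -= prob * log_prob / math.log(2.0)
    return h

def normal_approx_bits(n, p):
    """Closed form log2 sqrt(2 pi e n p q) obtained from the normal approximation."""
    return 0.5 * math.log2(2.0 * math.pi * math.e * n * p * (1.0 - p))

for n, p in [(17, 0.5), (1000, 0.5), (1000, 0.1)]:
    print(n, p, binomial_entropy_bits(n, p), normal_approx_bits(n, p))
```

For $p = \tfrac{1}{2}$ the closed form reduces to $\log\sqrt{2\pi n e} - 1$, in agreement with equation (8).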


Now if the retina is subdivided into $k$ independent regions of $n/k$
photoreceptors each, where the average firing frequency in region $i$ is $p_i$, the
amounts of information produced in the regions are additive, and we have

$$H = \sum_{i=1}^{k} \log\sqrt{2\pi e\, \frac{n}{k}\, p_i q_i}. \tag{11}$$

We may consider $k$ as a measure of “acuity” required (analogous to the
fineness of mosaic of a television screen or the resolving power of a
microscope). Hence $H$ appears as a function of the acuity and of the
distribution of intensities $p_i$. N.B.: Formula (11) does not reduce to (1) for $k = n$,
because the approximations used in the derivation presuppose that $n/k$
is fairly large.

The above considerations apply only where the firings of the indi- 
vidual photoreceptors are independent of each other. Such independence 
implies an assumption that the visual field is a perfectly randomized 
kaleidoscope. As a matter of fact, however, the visual field is much more 
organized. It should be described rather in terms of a more or less static 
background, against which more or less rigid figures are moving. There is 
thus an enormous amount of interdependence among the successive “pic- 
tures” of the visual field. This interdependence means a redundancy in 
the source, so that the retina is producing information at a rate much
below those calculated in the complete acuity case, and even in the “total
brightness” case, where $\tfrac{1}{2}\log n \approx 12$, giving an upper limit of 12,000 bits
per second. Moreover, the pattern of the picture has been completely
thrown away in the latter case.

One feels certain that the redundancy of the visual field is the crucial 
factor enabling us to get the details of the field at a cost far below the 
tremendous rates of information transmission required in the kaleidoscope 
situation. It is quite likely, to be sure, that much information is lost as 
the signals travel over the pathways in the visual system. The interesting 
question is what is thrown away and what is kept, and no less interesting 
a question is what codings are used to take advantage of the redundancies 
in the source, so as to push information through the “bottleneck.” Such 
questions were raised, for example, by W. A. H. Rushton (1950). 

The bottleneck for which our models will serve as prototypes is observed
in the optic nerve. Some $10^8$ photoreceptors map (through internuncials)
on some $10^6$ ganglion cells (Polyak, loc. cit.), whose axons carry
the impulses to the brain.

It is sometimes pointed out that, inasmuch as the cones of the central
fovea are connected in one-to-one fashion to the channel neurons (the
so-called “midget” ganglion cells) and inasmuch as these cones seem to be
involved in the perception of detail (Polyak, loc. cit.), there is really no
“bottleneck problem,” at least with respect to acuity of vision. However,
we are not here concerned with the specific mechanism of detail vision, but 
rather with the over-all characteristics of the visual system as a com- 
munication-transmitting device. The fact remains that the ratio of all 
photoreceptors to all channel fibers is about 100/1. To say that the many- 
one connections are not involved in acuity vision is to say that some 
information is thrown away. We are here simply raising the following 
question: Given a many-one type of connection, how much information 
needs to be thrown away under what circumstances? It is preliminary to 
the above-mentioned question raised by W. A. H. Rushton (loc. cit.) con- 
cerning how much information is actually lost in the visual system and 
under what circumstances. 

If there is sufficient redundancy among the messages originating in 
the photoreceptors, it is not inconceivable that the information can be 
pushed through this bottleneck with very little loss, provided the re- 
dundancy is taken advantage of by proper coding. The coding is presum- 
ably built into the structural and functional characteristics of the neural 
net which constitutes the retina. We might guess at the possible char- 
acteristics of that coding by examining the structure of that net. Or we 
can construct hypothetical “neural nets” to do specific coding jobs and
hope to learn something from that procedure. In the construction, we
must make use of units and assemblies which resemble neurons. The 
simplest model of a neuron is the McCulloch-Pitts-von Neumann type of 
binary relay (McCulloch and Pitts, 1943), and accordingly we will take it 
as our unit. Our first problem will be the following. Given a set of neurons 
producing information at a given rate and a channel (another set of neu- 
rons capable of transmitting information at a certain rate, which is al- 
ways taken to be the maximum possible for that set), to construct a cod- 
ing system (consisting of internuncials of the same type) which would take 
advantage of the redundancies of the source. 

The simplest case of redundancy is one where the messages are of un- 
equal probabilities. Thus if two independent neurons are firing, each with 
probability of only 1/10 per unit time, the four messages which can issue 
from this source at each moment have respectively probabilities .81, .09, 
.09, and .01. The amount of information produced is not the maximum 
possible (2 bits per unit time), but only 


$$-(.81 \log .81 + .18 \log .09 + .01 \log .01) = .94 \ \text{bits/message}. \tag{12}$$
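The figure in equation (12) can be reproduced directly. The following sketch enumerates the four messages of the two-neuron source and sums their entropy contributions; the variable names are illustrative only.

```python
import math
from itertools import product

p_fire = 0.1  # firing probability per unit time for each of the two neurons

# Probability of each joint message (a, b), the neurons being independent.
message_probs = {}
for a, b in product((0, 1), repeat=2):
    prob_a = p_fire if a else 1.0 - p_fire
    prob_b = p_fire if b else 1.0 - p_fire
    message_probs[(a, b)] = prob_a * prob_b

# Entropy in bits per message (logarithms to the base 2).
source_entropy = -sum(p * math.log2(p) for p in message_probs.values())
print(message_probs, round(source_entropy, 3))
```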


Now given a channel consisting of a single nerve fiber and, therefore, 
capable, according to our assumption, of transmitting at the rate of one 
bit per unit time, it ought to be possible to pump the information pro- 
duced by the two neurons through this channel without loss of informa- 
tion and without backlog. In the interest of simplicity, let us see what is 
involved in designing a coding net to do so. We shall, of course, use only 
“neurons” as the units of our apparatus. Quite a number of such neurons 
may be required, but it may still be worth while to have the coding net, 
inasmuch as the coding may be done “locally.” The saving is presumably
in having only a single long fiber to carry the information to a distant
destination. 

We shall first utilize the theorem of C. E. Shannon (1949) in construct- 
ing a code. According to this theorem, one maps all sequences of messages 
whose total probability comes closest to $\tfrac{1}{2}$ on the symbol “0” (taken to be
“non-firing” in our case) and the remainder on the symbol “1” (firing). 
One then subdivides these groups of sequences into subgroups, again of as 
nearly equal probabilities as possible and maps them on the four two- 
digit binary numbers, 00, 01, 10, and 11, proceeding in this way until there 
corresponds a binary number to each of all possible messages of some 
arbitrarily chosen length. The greater this chosen length of messages to 
be coded, the more messages there will be to code and, therefore, the more 
complex will be the coding system, but also the more efficient, in the sense
that the redundancies of the source will be the more fully exploited. This
principle will be exemplified in the particular case we have chosen. Since 
the source described above produces information only at the rate of .94 
bit per unit time (instead of the maximum rate of 2.0 bits), it contains 
considerable redundancy. If we code the four messages of unit duration, 
we will have taken advantage of some of this redundancy but not all. In 
fact, we will be able to transmit at the average rate of a message per 1.3 
units of time, which is better than a message per 2 units, as would have 
been the case if the entropy of the source were maximum, but not as good 
as a message per .94 units which is the limiting rate according to Shan- 
non’s fundamental theorem (Shannon and Weaver, loc. cit.). Let us see 
how this comes about and how the rate can be improved. 

Using the coding system described above, we have the following code

00 → 0
01 → 10
10 → 110          (13)
11 → 111


Here the left side represents the four messages of unit duration, which are 
the simultaneous firings of the two source neurons. The right side repre- 
sents the consecutive firings of the single channel neuron. The average 
length of the coded message will be 


$$.81 \times 1 + .09 \times 2 + .09 \times 3 + .01 \times 3 = 1.29 \ \text{units}. \tag{14}$$


Since the length of the sent messages is 1 unit, there will be a “backlog”
of 29%, i.e., the messages will pile up at the rate of a message per 3.4 
units of time. 
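The arithmetic of (14) and the unambiguous decodability of code (13) can be checked with a short sketch. The dictionary `code_13` transcribes the code above; the helper names are assumptions made for this example.

```python
code_13 = {"00": "0", "01": "10", "10": "110", "11": "111"}

p_fire = 0.1  # firing probability of each source neuron per unit time

def message_prob(message):
    """Probability of a message given as a string of 0/1 digits."""
    prob = 1.0
    for digit in message:
        prob *= p_fire if digit == "1" else 1.0 - p_fire
    return prob

# Average number of channel time units needed per unit-duration message.
average_length = sum(message_prob(m) * len(word) for m, word in code_13.items())

def is_prefix_free(words):
    """True if no code word is the beginning of another, so no 'space' is needed."""
    words = list(words)
    return not any(a != b and b.startswith(a) for a in words for b in words)

print(round(average_length, 2), is_prefix_free(code_13.values()))
```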

We can improve the situation if we use a more complicated code, i.e., if 
we map messages of 2 units duration. There will be 16 such messages rep- 
resented by binary numbers of four digits. The associated probabilities 
will be .6561 for the message 0000, .0729 each for messages with three 
zeros, .0081 for those with two zeros, .0009 for those with one zero, and 
.0001 for the message 1111. Applying our optimal coding procedure we
find that we can now transmit at the rate of a (double) message every 2.3 
units of time. Since the messages are produced at the source at the rate of 
a (double) message every 2 units, there will be a 15% backlog, which is 
a marked improvement over the preceding case. 
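The probabilities listed for the 16 double messages follow from the independence of the firings. A minimal sketch that groups the messages by the number of firings they contain is given below; the entropy of the double message is also computed, since it sets the limit that no coding of this source can beat. The names are illustrative.

```python
import math
from itertools import product

p_fire = 0.1  # firing probability of each source neuron per unit time

groups = {}  # number of firings -> [count of messages, probability of each one]
for digits in product((0, 1), repeat=4):
    k = sum(digits)
    prob = p_fire ** k * (1.0 - p_fire) ** (4 - k)
    entry = groups.setdefault(k, [0, prob])
    entry[0] += 1

total_probability = sum(count * prob for count, prob in groups.values())
double_message_entropy = -sum(count * prob * math.log2(prob)
                              for count, prob in groups.values())

for k in sorted(groups):
    print(k, groups[k][0], round(groups[k][1], 4))
print(round(double_message_entropy, 3))
```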

By coding longer and longer messages, we can improve the situation
indefinitely, actually eliminating the backlog altogether in this case, because
the rate of production of information at the source is actually less
than the channel capacity. But, as we can see, this can be done only at
the expense of more and more complex coding systems. To code messages
of $k$ units duration, we must code $4^k$ different messages. An interesting
theoretical question is what is the minimum value of $k$ which will eliminate
the time lag completely (where it is possible to eliminate it). One
suspects that this minimum $k$ is a function of the difference between the
entropy of the source and the channel capacity, but possibly depends on
other factors too. Another interesting question concerns the relation be- 
tween the efficiency of the coding and the number of internuncials re- 
quired for the coding net. 

Let us now take a particular coding system, say, using messages of 2 
units’ duration and construct an automaton (a McCulloch-Pitts net) to 
transmit the messages from two neurons over a single fiber. A backlog of 
un-sent messages will be accumulated. There will, therefore, have to be 
an information storing device. But since its capacity will be finite, some 
information will have to be thrown away now and then. 

The best possible coding of 2-unit messages is as follows: (0000) → (0);
(0001) → (1000); all other messages on five digit numbers. The reason it
is not possible to use four digit numbers for all non-zero messages (as 
would seem sufficient for 15 messages) is because the use of a single digit 
message (0) for the most frequent message (0000) makes the coded mes- 
sages of unequal length. Hence ambiguity cannot be avoided unless a 
“space’’ is used. But this introduces another signal of which our “neu- 
rons” are not capable. The introduction of five digit numbers removes 
this difficulty. (Cf. Shannon and Weaver, loc. cit., p. 32, where three digit 
numbers have to be introduced to code four messages.) 

We can simplify our coding net at a small expense of transmitting time 
if we forego the privilege of having one four digit number (1000), thus 
making all but (0) five digit numbers. (This slows down transmission to 
2.37 units per 2-unit message, a loss of 18.5%.) We accordingly code as 
follows 

0000 → 0
1000 → 11000
0100 → 10100          (15)
0010 → 10010
0001 → 10001

and so on, each double message four digit number mapping on a five digit
number obtained by adding 1 in front. The “1” thus acts as a space; (0) is
the only coded message not starting with “1” and thus needs no
space to differentiate it.
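The simplified code (15) is easy to state as a pair of functions. The sketch below is an illustration of the coding rule only, not of the neural net that realizes it; the function names `encode` and `decode` are hypothetical. It also evaluates the average transmission time, which comes out at approximately 2.376 units per 2-unit message, in line with the figure of 2.37 quoted above.

```python
from itertools import product

def encode(message):
    """Map a four digit double message onto its coded form under code (15)."""
    return "0" if message == "0000" else "1" + message

def decode(stream):
    """Recover the double messages from a complete stream of coded digits."""
    messages = []
    i = 0
    while i < len(stream):
        if stream[i] == "0":          # a lone 0 stands for the message 0000
            messages.append("0000")
            i += 1
        else:                         # a leading 1 announces four message digits
            messages.append(stream[i + 1:i + 5])
            i += 5
    return messages

p_fire = 0.1  # firing probability of each source neuron per unit time
average_length = 0.0
for digits in product("01", repeat=4):
    message = "".join(digits)
    k = message.count("1")
    average_length += p_fire ** k * (1.0 - p_fire) ** (4 - k) * len(encode(message))

sent = ["0000", "1000", "0000", "0110"]
stream = "".join(encode(m) for m in sent)
print(stream, decode(stream), round(average_length, 4))
```

Because the leading “1” tells the receiver that exactly four more digits follow, no separate “space” signal is needed.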

Thus it will take five time units for the channel fiber to transmit a 2- 
unit message for all such messages except the most frequent one (0000), 
for which it will take only one unit. The backlog messages will then be 
piling up as long as messages other than (0000) are sent, but will be re- 
duced by one message every time (0000) is sent. 

To take care of the backlog, we will construct a simple storage device, as 
shown in Figure 1. 

It is easily seen from the figure that the neuron pair $(A_i, B_i)$ is emitting
the message sent by the neurons $(A, B)$, which are source neurons, with
a delay of $i$ units. Thus the message sent by $(A, B)$ $i$ units ago is still
available for transmission for $i \leq l$. Thus information produced by $(A, B)$
can be stored for $l$ units of time.

FIGURE 1
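The storage chain behaves like a shift register. The following sketch models it abstractly, without individual neurons: the class name `DelayLine` and its methods are assumptions made for this illustration, and the chain length plays the role of $l$.

```python
from collections import deque

class DelayLine:
    """Chain of l neuron pairs; pair i repeats what (A, B) emitted i units ago."""

    def __init__(self, length):
        # All pairs are silent at the start.
        self.cells = deque([(0, 0)] * length, maxlen=length)

    def step(self, a, b):
        """Advance one time unit: each pair takes over its predecessor's state,
        the first pair takes over the state of (A, B); the oldest message is lost."""
        self.cells.appendleft((a, b))

    def read(self, i):
        """Message emitted by (A, B) i units ago, for 1 <= i <= length."""
        return self.cells[i - 1]

line = DelayLine(3)
for a, b in [(1, 0), (0, 1), (1, 1), (0, 0)]:
    line.step(a, b)

print(line.read(1), line.read(2), line.read(3))
```

With a chain of length 3, the message (1, 0) fed in four steps earlier is no longer available, which is the sense in which the capacity of the store is finite.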

We next construct a “scanning” device, by means of which the double
message $\alpha\beta\gamma\delta$ ($\alpha, \beta, \gamma, \delta = 0$ or 1) sent by $(A, B)$ is mapped upon the
channel neuron $C$ as four single digits. This device is shown in Figure 2.

The essential feature in the scanning device is a cycle of $2l$ neurons
$Q_1, Q_1', Q_2, Q_2', \ldots, Q_l, Q_l'$, which is self-reverberating. We shall refer to it
as the “clock.” If $Q_1$ fires at the moment $t = 0$, the consecutive $Q$'s will
fire in turn until $Q_1$ fires again at $t = 2l$.

The neurons $\alpha$ and $\beta$ have threshold 2. From the structure of the net,
it is easy to see that for $0 < t \leq 2l$,

$\alpha_0$ can fire only at $t = 1$, and then if, and only if, $A$ has fired at $t = 0$;
$\beta_0$ can fire only at $t = 2$, and then if, and only if, $B$ has fired at $t = 0$;
$\alpha_1$ can fire only at $t = 3$, and then if, and only if, $A$ has fired at $t = 1$;
$\beta_1$ can fire only at $t = 4$, and then if, and only if, $B$ has fired at $t = 1$.


Now if all the $\alpha$ and $\beta$ neurons are connected to our channel neuron $C$
(of threshold one) we see that the 2-unit messages $\alpha\beta\gamma\delta$ ($\alpha, \beta, \gamma, \delta = 0$ or 1
but not all 0) are coded onto non-zero, four digit binary numbers during
the interval $0 < t \leq 2l$.

However, our coding of these messages must map them on five digit
numbers, each beginning with a “1.” We must thus modify our net so that
C will fire automatically every fifth moment except when the message is
0000. Neglecting for a moment this exceptional case, let us make C fire
automatically every fifth time. This can be easily accomplished by an- 
other self-reverberating circuit of five neurons as shown in Figure 3. This 
circuit will be called the “spacer.” 

The “spacer” creates another problem. Since 5 is odd, the automatic
firing of $C$ will throw the “scanning” out of phase, since one digit will be
skipped by the automatic progression of the clock, and the firings of $A$
will be interpreted as those from $B$. This can be corrected by a device
which will effectively “switch” the connections so that if originally the
$Q$ neurons in Figure 2 were connected to the $\alpha$'s and the $Q'$ neurons to the
$\beta$'s, they will after the “switching” be connected vice versa. Such a
switching device is shown in Figure 4.

FIGURE 2

It consists of two self-activating neurons $S_1$ and $S_4$ which, once activated,
fire continuously until inhibited.* Neurons $S_2$, $S_3$, $X$, and $Y$ have
threshold 2 each. Suppose now $S_1$ is firing but not $S_4$. This means a firing
of $W$ will fire $X$ but not $Y$. Hence $W$ is “effectively” connected to $X$ but
not to $Y$. Now suppose $S_5$ fires, thus (because $S_1$ is firing) activating $S_2$
but not $S_3$. Then, through its inhibitory connection to $S_1$ (dotted), $S_2$ will
extinguish $S_1$ and through its excitatory connection to $S_4$ will activate $S_4$.
The situation is now reversed: $W$ is effectively connected to $Y$ but not
to $X$. We thus have a “switching” device: the firing of $S_5$ switches the
effective connection of $W$ from $X$ to $Y$, and, by symmetry, the next time
$S_5$ fires, the connection will be switched the other way.

* The fact that such single neuron cycles are excluded by physiological considerations is
irrelevant for our model. To begin with, they can always be replaced by larger cycles if another
unit of time is chosen. Furthermore, the model makes no claim for physiological verisimilitude.

FIGURE 3

FIGURE 4

If we now include the “spacer” and the “switcher” into our net, where
the spacer automatically excites the channel neuron $C$ every fifth moment,
and the switcher shifts connections from the neurons $Q$ and $Q'$ to $\alpha$'s and
$\beta$'s alternately, we have accomplished the following coding: each 2-unit
message is mapped on a five digit binary number in $C$.


FIGURE 5


It remains to take care of the message (0000). This is a special message,
because it maps on the single digit (0) and should move the clock Q back 
by a unit of time, since the original message takes two units and the coded 
message only one. 

We will first describe a neuron which will fire if, and only if, neither 
A nor B fires for two consecutive moments. Such a neuron is shown in 
Figure 5. In this figure $C'$ “scans” $A$ and $B$ in the way described above.
Hence $C'$ responds to a 2-unit message from $(A, B)$ by a four digit number.
This number is “preserved” on the chain $(N_1, N_2, N_3, N_4)$ which
exhibits the number simultaneously. The neuron $N$ has threshold one,
hence will fire if, and only if, the four digit number is not 0000. Moreover,
$N$ is synchronized with a clock $q$, so that it can fire only once every four
units. Finally $N$ inhibits $Z$, which is self-activating. Hence $Z$ responds
only to (0000) from $C'$ and thus to the message (0000) from $(A, B)$. We
shall call $Z$ the zero neuron.

Finally, $Z$ must set the clock $Q$ back by a unit of time, since this time
has been gained in transmitting a 2-unit message in one unit of time. A
simple clock-setting-back net is shown in Figure 6. The “clock” is the
cycle (1, 2, 3). The neurons $a$, $b$, $c$ have threshold 2, hence one of them
fires only if $Z$ and the corresponding neuron of the clock fire together. For
simplicity it will be assumed that $a$, $b$, and $c$ have no synaptic delay, i.e.,
fire simultaneously with the neurons which excite them. The situation
where there is synaptic delay is not essentially different: it merely postpones
the setting of the clock back for a unit of time and necessitates a
somewhat more complicated net. After 1 has fired, it is 2's turn to fire.
But if $Z$ (and therefore $a$) have fired simultaneously with 1, 2 is inhibited,
and 3 excited instead. But 3 on the clock is behind 1. Hence the clock has
been set back by a unit of time instead of marking the next unit of time.

Our net is now complete. To summarize: it consists of a “storage”
(Fig. 1), a “scanner” (Fig. 2), a “spacer” (Fig. 3), a “switcher” (Fig. 4),
a zero neuron (Fig. 5), and a clock-setting-back device (Fig. 6). The storage
keeps the backlog messages. Since these pile up at the average rate
of about a message per five units of time, the $l$ units of the memory storage
become filled on the average once every $5l$ units of time. This is also
the average of time of the revolution of the clock $Q$, which moves forward
with each non-zero message and backward with each zero message. When
the clock makes a complete revolution, the scanning is started at $(A, B)$
again, i.e., the backlog messages have been thrown away. In other words,
the 18.5% lag in the transmission results in throwing away of 18.5% of
the messages.

The foregoing scheme certainly appears as a tour de force and is not 
offered as a model of any actual neural mechanism. It is described merely 
to show what may be required to effect the “spatial compression” with 
neurons of the McCulloch-Pitts type. 

We will now take a different approach. Our problem now is not to ef- 
fect a compression without any serious loss of information by utilizing a 
complicated coding device, but simply to indicate the optimum connec- 
tions in a “bottleneck,” which would minimize the (inevitable) loss of 
information by utilizing some principles of information theory.

FIGURE 7


In particular, consider a set of $n$ neurons mapping upon 2 channel neurons.
The connections are effected either by the branching of the axons
or by dendrites of the channel neurons (Fig. 7). At any rate, in the
symmetrical case the connecting links fall into two classes: $2n_0$ links leading to
only one of the channel neurons, and $m$ leading to both ($2n_0 + m = n$).
Given the firing frequency $p$ of the $n$ neurons, we wish to determine the
optimum $n_0$ (or $m$), i.e., the partition of the $n$ receptors into the two
classes described so that the loss of information shall be minimum.

Our point of departure is Shannon's equation (Shannon and Weaver,
loc. cit.)

$$H(x) + H_x(y) = H(y) + H_y(x), \tag{16}$$

where $H(x)$ is the entropy of the source, $H_x(y)$ the uncertainty of the
message transmitted, given the message sent, $H(y)$ the entropy of the
channel, and $H_y(x)$ the uncertainty of the message sent, given the message
transmitted. If the channel is noiseless, then $H_x(y) = 0$, i.e., if the
message sent is known, the message transmitted is certain. Such is the
case in the particular example we have chosen. But we see that our
channel is “singular,” i.e., several messages sent may correspond to a
single message transmitted. Hence $H_y(x) > 0$. But this term is a measure
of the information lost in the transmission. Rewriting equation (16)
without the noise term, we have

$$H(x) = H(y) + H_y(x). \tag{17}$$

Hence the problem of minimizing information loss (i.e., minimizing $H_y(x)$)
reduces to the problem of maximizing channel entropy, H(y). We there- 
fore wish to find a net from a given class of nets (i.e., the partitions of the 
neurons into singly and doubly connected ones) which will maximize 
H(y). We shall solve the simplest case where the threshold of both channel 
neurons is one. 

The two channel neurons can send the following four messages: (11), 
(01), (10), and (00). Their probabilities are given by the following ex- 
pressions, where $q = 1 - p$ (the probability of not firing):

$$P(1, 1) = 1 - 2q^{n-n_0} + q^n,$$
$$P(0, 1) = P(1, 0) = q^{n-n_0} - q^n, \tag{18}$$
$$P(0, 0) = q^n.$$

Hence the entropy of the channel is given by

$$-H = (1 - 2q^{n-n_0} + q^n)\log(1 - 2q^{n-n_0} + q^n) + 2(q^{n-n_0} - q^n)\log(q^{n-n_0} - q^n) + q^n \log q^n. \tag{19}$$
Differentiating with respect to $n_0$, we obtain

$$\frac{dH}{dn_0} = 2q^{n-n_0} \log_e q \, \log\frac{q^{n-n_0} - q^n}{1 - 2q^{n-n_0} + q^n}. \tag{20}$$

Now the maximum value of $n_0$ is $n/2$. For that value

$$\frac{dH}{dn_0} = 2q^{n/2} \log_e q \, \log\frac{q^{n/2}}{1 - q^{n/2}}. \tag{21}$$

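Since $q < 1$, the factor $\log q$ in (21) is negative, so expression (21) is negative if

$$1 - q^{n/2} < q^{n/2} \quad \text{or} \quad q^{n/2} > \tfrac{1}{2}, \tag{22}$$

in which case $H$ increases as $n_0$ is reduced below $n/2$.
Hence for firing frequencies sufficiently small it pays to cross-connect.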
The best cross-connection is given by $n_0^*$, the value of $n_0$ in the equation

$$q^{n-n_0} - q^n = 1 - 2q^{n-n_0} + q^n, \quad \text{i.e.,} \quad 3q^{n-n_0} = 1 + 2q^n. \tag{23}$$

That is,

$$n_0^* = \frac{\log(3q^n) - \log(1 + 2q^n)}{\log q}. \tag{24}$$


It is interesting to compute the limit of $n_0^*$ as $q$ approaches unity, i.e.,
as the frequency of firing becomes very small. Applying L'Hôpital's rule,
we have

$$\lim_{q \to 1} \frac{\log(3q^n) - \log(1 + 2q^n)}{\log q} = \frac{n}{3}.$$


This result is intuitively evident. For very small firing frequencies we 
are dealing essentially with a situation where the receptors fire at most 
one at a time, simultaneous firings being vanishingly rare. In this case, 
if we connect ⅓ of the receptors to one channel neuron, ⅓ to both, and ⅓ to
the other, we at least can place the firing neuron into one of three groups 
of equal size, which is our best guess under the circumstances. (No cross- 
connections would place it in only one of two groups.) 
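The optimum partition can also be found by direct search, which gives an independent check on formula (24) and on the limit $n/3$. The sketch below assumes, as in the text, two channel neurons of threshold one and $n_0$ receptors connected to each channel neuron alone; the function names are hypothetical.

```python
import math

def channel_entropy_bits(n, n0, p):
    """Entropy H(y) of the two channel neurons, from the probabilities (18)."""
    q = 1.0 - p
    u = q ** (n - n0)   # probability that one given channel neuron stays silent
    c = q ** n          # probability that both channel neurons stay silent
    probs = [1.0 - 2.0 * u + c, u - c, u - c, c]
    return -sum(x * math.log2(x) for x in probs if x > 0.0)

def best_n0_by_search(n, p):
    """Integer n0 in 0..n/2 that maximizes the channel entropy."""
    return max(range(n // 2 + 1), key=lambda n0: channel_entropy_bits(n, n0, p))

def best_n0_formula(n, p):
    """Continuous optimum n0* from formula (24); the base of the logarithm cancels."""
    q = 1.0 - p
    qn = q ** n
    return (math.log(3.0 * qn) - math.log(1.0 + 2.0 * qn)) / math.log(q)

n = 30
print(best_n0_by_search(n, 0.001), best_n0_formula(n, 0.001))  # rare firing
print(best_n0_by_search(n, 0.2))                               # frequent firing
```

For rare firing the search settles near $n/3$, while for frequent firing (here $q^{n/2} < \tfrac{1}{2}$) the maximum lies at $n_0 = n/2$, i.e., no cross-connections.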

Several generalizations of the problem immediately suggest themselves. 
One can, for example, introduce thresholds greater than one for the chan- 
nel neurons. Then the mathematics is immediately complicated, since in 
that case the equations determining $n_0^*$ become transcendental. However,
the application of the method is entirely straightforward, and nu- 
merical results can be obtained in any specific problem. One can next 
“stagger” the responses of the channel neurons. Then the information 
capacity of the channel increases considerably, since the order of firing 
has been introduced. The distinguishable responses then become 


$$(0\,0),\ (0_1 1_2),\ (0_2 1_1),\ (1_1 0_2),\ (1_2 0_1),\ (1_1 1_2),\ (1_2 1_1),$$


where the subscripts identify the neurons and the order in parenthesis is 
the time sequence of their response. This gives a maximum channel 
capacity of $\log 7 = 2.8$ bits per message, instead of 2 bits. To effect the
staggering, and to take advantage of it, internuncials must be introduced,
which complicates the coding apparatus.

However, it must be pointed out that the method which begins with 
the consideration of the simplest possible ‘‘bottleneck” nets and proceeds 
to the study of nets complicated by the introduction of a few strategically 
placed additional elements or by simple modifications of given elements 
suggests the evolutionary outlook. It is impossible to envisage the evolu- 
tion of a net with specifically designed “spacers,” “clocks,” etc., such as 
we have described earlier, without invoking specific purpose in the 
design—an idea foreign to the most readily acceptable evolutionary 
ideas. On the other hand, the evolution of nets which could conceivably 
arise through accidental modifications is in line with the natural selection 
principle. According to this principle, each new step in the gradual de- 
velopment must be characterized by some “advantage” to the organism 
and must at the same time be a “‘simple”’ step, conceivably the result of a 
mutation. Such steps can be imagined if some over-all quantitative
characteristic of the net can be affected by each of them. We suggest that this
characteristic can be taken as the efficiency of the net as an information- 
transmitting device. For example, a simple increase or decrease in the 
number of overlapping connections in our second model has a definite 
quantitative relation to the amount of information lost in the bottle- 
neck. Such changes can be readily imagined as being governed by a 
genetic complex. Therefore, they can be postulated as ‘evolutionary 
events.” The same is true for thresholds and for more or less haphazardly 
introduced internuncials, which may effect a “staggering” of response in 
the channel neurons, mentioned above. Once such internuncials appear, 
their connection schemes may become subject to natural selection, again
according to the information-transmitting efficiency involved, including 
the selection of particular kinds of information of importance to the 
organism. Thus, the history and the significance of the bipolar cells in the
intermediate layers of the retina are suggested.


This investigation is part of the work done under Contract No. AF. 
19(122)-161 between the U.S. Air Force Cambridge Research Laboratories 
and the University of Chicago.


The scattering and absorption cross-sections for the resonant modes of biological cells in
plane waves are computed. These are both so many orders of magnitude smaller than the 
physical cross-section that the scattering or absorption by a suspension of cells in a plane 
wave should be negligible. 


In the first paper of this series (Ackerman, 1954b, referred to hereafter 
as I), it was shown that radiation and viscous damping limited the possible 
modes of mechanical resonance of biological cells and strongly damped 
the vibration, but had little effect on the resonant frequency. In experi- 
mental work, the problem usually encountered is the response of the 
cell to mechanical vibrations. In this paper, the cross-sections of the 
cells for plane waves are investigated. (The cross-section is a convenient 
method of expressing the coupling of the applied vibration to the modes 
of vibration.) 

As in I, the model used is a cell of spherical shape held together by a
membrane possessing only an interfacial tension, $T$. In part I, it was
shown that viscosity made it necessary to include rotational (transverse)
waves as well as irrotational (compressional) ones. The former limited the
possible resonant modes to those for which $l \geq 2$, but had no other effect.
(The parameter $l$ entered into the resonant mode through the surface
harmonic $P_l^l(\cos\theta)e^{il\psi}$, where the angle $\theta$ is the colatitude and the
angle $\psi$ is the longitude.) However, the viscosity strongly damped even
the compressional vibration. In this and the following paper, only the
damped irrotational vibrations are considered; however, the discussion is
restricted to modes for which $l \geq 2$. Thus only boundary conditions (1)
and (6) of part I are necessary at the cell; these express the continuity of
the normal velocities and the equality between the normal stress on
the surface and the interfacial tension restoring force. Substituting the
relations of the acoustic pressure and the particle velocity to the scalar
velocity potential, $\Phi$, into these boundary conditions yields the
following equations:


$$\frac{\partial \Phi_i}{\partial r} = \frac{\partial \Phi_0}{\partial r} \tag{1}$$
and 
: O®; ve a 
w? (pi®; — poo) — 1 | i Ae 2) = a4 sin 6 (2) 


F ae 
atua sri) | 


These equations are both valid at the cell surface, $r = a$. (For meaning of
symbols, see the last page of I.) Since the density, $\rho$, sound velocity, $c$,
and reciprocal wave length, $k/2\pi$, are nearly the same in both the cell
and the surrounding medium, the subscripts on these quantities will be
omitted from here on.

The scattering cross-section, $\sigma_s$, is defined by the equation:

$$\sigma_s = \frac{\Pi_s}{I_e}, \tag{3}$$

where $\Pi_s$ is the total acoustic power scattered by the cell and $I_e$ is the
incident intensity. The computation of $\sigma_s$ which follows allows one to
evaluate the probability of observing these resonances by the scattered
radiation.

In contrast to I, the velocity potential outside the cell, $\Phi_0$, must be
interpreted to consist of two parts, an incident plane wave denoted by
the subscript $e$ (for external) and a scattered spherical wave denoted by
the subscript $s$. One now writes

$$\Phi_0 = \Phi_e + \Phi_s \tag{4}$$

and determines $\Phi_s$ in terms of $\Phi_e$.


An incident plane wave along the polar axis can be represented by 
(Morse, 1948) 

$$\Phi_e = E e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} = E e^{-i\omega t} \sum_{n=0}^{\infty} i^n (2n+1)\, j_n(kr)\, P_n(\cos\theta),$$

where $E$ is the velocity potential amplitude and the other symbols are the 
same as in I. This expression contains no terms of the form $P_n^l(\cos\theta)e^{il\psi}$ 
except for $l = 0$. Thus $\Phi_s$ and $\Phi_i$ will have no steady-state terms with 
$l \geq 2$. In order to obtain a component of the desired form, it will be 
necessary to consider an incident plane wave at an angle, $\theta_e$, with the 
polar axis. For simplicity, the wave is chosen in the plane $\psi_e = 0$; the 
expression for the incident wave now is 

$$\Phi_e = E e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} = E e^{-i\omega t} \sum_{n=0}^{\infty} i^n (2n+1)\, j_n(kr) \sum_{m=0}^{n} (2 - \delta_{0m}) \frac{(n-m)!}{(n+m)!}\, P_n^m(\cos\theta)\, P_n^m(\cos\theta_e) \cos m\psi .$$


In particular the coefficient of $\cos 2\psi\, e^{-i\omega t}$, called $\Phi_{e2}$, is given by the 
expression 

$$\Phi_{e2} = -E\,\frac{15}{4}\, j_2(kr) \sin^2\theta_e \sin^2\theta \approx -E\,\frac{k^2r^2}{4} \sin^2\theta_e \sin^2\theta \quad \text{if } kr \ll 1. \tag{6}$$

This will have its maximum at $\theta_e = \pi/2$, and for simplicity this choice 
will be made for most of this discussion. 
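
The small-argument step in (6) can be checked numerically. The sketch below is an illustration only, not part of the original derivation: it assumes that the $n = 2$, $m = 2$ term of the incident wave carries the factor $(15/4)\,j_2(kr)$ and uses the standard closed form of $j_2$; the function names are hypothetical.

```python
import math

def spherical_j2(x: float) -> float:
    """Closed form of the spherical Bessel function j_2(x), valid for x != 0."""
    return (3.0 / x**3 - 1.0 / x) * math.sin(x) - (3.0 / x**2) * math.cos(x)

def phi_e2_exact(E: float, kr: float, theta: float, theta_e: float = math.pi / 2) -> float:
    """Assumed coefficient of cos(2*psi) exp(-i*omega*t): -E (15/4) j_2(kr) sin^2(theta_e) sin^2(theta)."""
    return -E * 3.75 * spherical_j2(kr) * math.sin(theta_e) ** 2 * math.sin(theta) ** 2

def phi_e2_small(E: float, kr: float, theta: float, theta_e: float = math.pi / 2) -> float:
    """Small-argument form, using j_2(x) ~ x^2 / 15 for x << 1."""
    return -E * (kr**2 / 4.0) * math.sin(theta_e) ** 2 * math.sin(theta) ** 2

if __name__ == "__main__":
    for kr in (0.5, 0.1, 0.05):
        exact = phi_e2_exact(1.0, kr, math.pi / 3)
        approx = phi_e2_small(1.0, kr, math.pi / 3)
        print(f"kr={kr}: exact/approx = {exact / approx:.6f}")
```

The ratio tends to one as $kr$ decreases, which is the sense in which the approximation is used at the cell surface.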

In the previous part, equations (1) and (2) were used to determine $\omega$ for 
free vibrations. The resulting value of $\omega$ was complex, the real part being 
interpreted as $2\pi$ times the frequency and the imaginary part as the 
exponential time rate of damping. It was found that if $\Phi_o$ and $\Phi_i$ were chosen 
to obey the wave equation and to represent no sources other than at the 
cell wall, then the complex values of $\omega$ formed a uniquely determined 
sequence. In the present discussion a source is present at infinity giving 
rise to $\Phi_e$. This source has an arbitrary real frequency $\omega$ and a steady-state 
solution results. 

The scattered wave must have the same form as the wave outside the 
cell in I. For the lowest resonant mode one may write, the factors 
$\cos 2\psi\, e^{-i\omega t}$ being understood, 

$$\Phi_{s2} = D_2\{ j_2(kr) + i n_2(kr) \} P_2^2(\cos\theta) \approx -\frac{9i D_2}{k^3r^3} \sin^2\theta \quad \text{if } kr \ll 1.$$

The motion inside the cell must likewise have the same form as in part I; 
i.e., the second mode has the form 

$$\Phi_{i2} = C_2\, j_2(kr)\, P_2^2(\cos\theta) \approx C_2\,\frac{k^2r^2}{5} \sin^2\theta \quad \text{if } kr \ll 1.$$

The value of $kr$ at the cell surface is much less than one so the above 
approximations are all valid there. These expressions are now substituted 
into equations (1), (2), and (3) and the latter are solved for $C_2$ and $D_2$ 
in terms of $E$ and $\omega$. This gives the two equations: 


5 2 pw2a? +i3wa [290 — nil 
2° [5 pw2a® — 367] +i6wa [ni +4701 


CE 


and ; 
a anes 2wa [ni t2no] —7 (3 pw?a — 127) (7) 
a" T8) [5 pw2a? — 367] +i6we [yi +40] © 


The intensity, J, in a plane wave or a spherical wave, if kr > 1, is given 
by the expression 


2 


where the horizontal bar indicates a time average. The total power, $\Pi$, 
passing through a spherical surface is 

$$\Pi = \iint J_r\, r^2 \sin\theta \, d\theta \, d\psi ,$$

where $J_r$ is the component of $J$ normal to the spherical surface. The 
scattered wave is normal to any spherical surface concentric with the cell, 
whereas the incident plane wave gives zero total power passing through 
any spherical surface. Using the orthogonal properties of the surface 
harmonic functions, one may rewrite equation (3) as 


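$$\sigma_s = \lim_{r \to \infty} \frac{\iint |\Phi_s|^2 r^2 \sin\theta \, d\theta \, d\psi}{|\Phi_e|^2} = \lim_{r \to \infty} \sum_n \frac{\iint |D_n|^2 \left| [\, j_n(kr) + i n_n(kr)]\, P_n^l(\cos\theta) \cos l\psi \right|^2 r^2 \sin\theta \, d\theta \, d\psi}{|\Phi_e|^2} .$$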


Thus the cross-section can be considered to consist of a sum of independent 
cross-sections; each of these is individually small and rises to a maximum 
when the frequency of the incident wave approximates the resonant 
frequency for that mode. Using the approximation, 

$$\lim_{x \to \infty} [\, j_m(x) + i n_m(x)] = \frac{i^{-(m+1)} e^{ix}}{x},$$

and equation (5), the contribution of the lowest mode to the scattering 
cross-section, $\sigma_{s2}$, is found to be 

$$\sigma_{s2} = \frac{48\pi}{5k^2} \left| \frac{D_2}{E} \right|^2 .$$

Comparing this with equation (7) shows that $\sigma_{s2}$ will have a maximum 
for a frequency $\omega/2\pi$ such that 
Some la 


o = \V> 


5 pas’ 


which is the same as the natural frequency of the undamped oscillations 
of such a system (Ackerman, 1951). Using this frequency, and the data 
for P. caudatum used previously, the ratio of $\sigma_{s2}$ to the geometric 
cross-section $2\pi a^2$ can be numerically estimated as 


Os 


27a? eee 


Similarly minute ratios are obtained using red blood cell data. 

This ratio must be of the order of magnitude of the ratio of the scattered 
intensity near the cell to the incident intensity. Farther from the cell 
the ratio would decrease. Such small ratios completely preclude any 
possibility of observing these resonances by means of the scattered 
wave. Physically this can be interpreted as a consequence of the weak 
coupling to this mode, since the cells are much smaller than the wavelength 
and since the Q is so low. 

Although the scattered wave is small, it is conceivable that a detectable 
amount of the incident energy might be absorbed at the vibrating cell 
itself. To investigate this possibility consider the absorption cross-section, 
$\sigma_a$, defined by 


where II, — Ip is the total decrease in power flowing through the plane 


r sin @cosw=b 


parallel to the original wavefront. Thus $\sigma_a$ will be given by 


Melee Ue pe sthe ooh) ay. 
eee. a : sin 6 cos’? y 
Os ee ny 7 ee ee 


As in the case of the scattering cross-section, $\sigma_a$ will consist of an infinite 
set of terms, one for each mode. Since $|D_2|^2 \ll |E|^2$, the contribution of 
the lowest mode, $\sigma_{a2}$, is given approximately by 


2m 48 | D, 
K? S\E 


oa. + 


40 EUGENE ACKERMAN 


This also is a maximum, for a frequency close to the undamped resonant 
frequency. Evaluating this numerically, one finds for various observed 
values (e.g., Ackerman, 1954a) 


Car 


isi eee Se 


Although this ratio is much larger than the scattering cross-section ratio 
it is still too small for any possibility of observation. This is illustrated by 
the following example. For mammalian red blood cells the above approxi- 
mation gives o, ~ 10-!"cm.? Suppose that an absorption of 1% of the 
incident energy were just observable; then for a concentration of 10” 
cells/ml, a path length of 10° cm. would be necessary. These results agree 
with those of E. L. Carstensen et al. (1953), who could observe no 
absorption due to the cell membranes in red blood cell suspensions. 

Although numerical values of these cross-sections have been worked 
out only for the lowest resonant modes, it is clear that even if the other 
non-resonant modes increased the total cross-sections by several orders 
of magnitude these effects would still be experimentally unobservable. 
Likewise one would predict that cell suspensions in high intensity sound 
fields should be undamaged except by cavitation. This also is in agree- 
ment with experimental observations. 

To summarize this discussion of cross-sections in a plane wave, the 
following conclusions should be noted: 

(1) scattering cross-sections are too small for detection of resonances; 
and 


(2) the absorption by the cell membranes is so small that no possible 
effects should be detectable. 

It is shown that if the mth derivative of a function is positive, and it has a Legendre 
polynomial expansion with coefficients, $A_n$, then $A_m/(2m + 1) \geq A_n/(2n + 1)$ for $n > m$. 
This result is applied to the theory of liquid phase transitions. 


In the work by I. Isenberg (1954, referred to as I) on the theory of 
the nematic phase and its possible relation to mitotic spindle structure 
and in L. T. Zimmer (1955) some of the conclusions depend on the 
magnitude of the coefficients in the Legendre polynomial expansion of the 
function $\beta(u)$. Here we prove a theorem relating the relative magnitude 
of the coefficients in a Legendre polynomial expansion to the derivatives 
of the function, and then point out its application to this work. 

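1. Theorem. If 

$$f(x) = \sum_{n=0}^{\infty} A_n P_n(x), \quad -1 \leq x \leq 1, \tag{1}$$

where the $P_n(x)$ are the Legendre polynomials, and if 

$$\frac{d^m f}{dx^m} \geq 0, \quad -1 \leq x \leq 1, \tag{2}$$

then 

$$\frac{A_m}{2m + 1} \geq \frac{A_n}{2n + 1} \quad \text{for } n > m, \tag{3}$$

with equality only if the mth derivative is identically zero. 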
Proof: First, to get an integral relation between successive derivatives 
of the $P_n(x)$, we take the mth derivative of the differential equation for 
$P_n(x)$, 

$$(1 - x^2)\frac{d^2 P_n}{dx^2} - 2x\frac{dP_n}{dx} + n(n + 1)P_n = 0, \tag{4}$$

and get 

$$(1 - x^2)\frac{d^2 P_n^{(m)}}{dx^2} - 2(m + 1)x\frac{dP_n^{(m)}}{dx} + [n(n + 1) - m(m + 1)]P_n^{(m)} = 0, \tag{5}$$

where 

$$P_n^{(m)}(x) = \frac{d^m P_n}{dx^m} .$$

Multiplying by $(1 - x^2)^m$ and combining the first two terms, 

$$\frac{d}{dx}\left[(1 - x^2)^{m+1}P_n^{(m+1)}\right] + [n(n + 1) - m(m + 1)](1 - x^2)^m P_n^{(m)} = 0,$$

from which we have the desired relation 

$$(1 - x^2)^{m+1}P_n^{(m+1)}(x) = -(n - m)(n + m + 1)\int_{-1}^{x}(1 - x^2)^m P_n^{(m)}(x)\,dx. \tag{6}$$

From (6), using L'Hôpital's rule, or from (5), we have 

$$P_n^{(m+1)}(1) = \frac{(n - m)(n + m + 1)}{2(m + 1)}P_n^{(m)}(1), \tag{7}$$

which gives, by induction, 

$$P_n^{(m)}(1) = \frac{(n + m)!}{2^m m!\,(n - m)!} \quad \text{for } n \geq m. \tag{8}$$
Now to prove 

$$|P_n^{(m)}(x)| \leq P_n^{(m)}(1) \quad \text{for } -1 \leq x \leq 1 \text{ and } n \geq m, \tag{9}$$

we use a device suggested by G. Szegő (1939, p. 160) and introduce the 
function 

$$\phi(x) = [P_n^{(m)}(x)]^2 + \frac{1 - x^2}{(n - m)(n + m + 1)}[P_n^{(m+1)}(x)]^2 .$$

Then 

$$\phi(x) = [P_n^{(m)}(x)]^2 \tag{10}$$

at all points where $P_n^{(m+1)}(x) = 0$ and also at $x = \pm 1$, and 

$$\phi(x) > [P_n^{(m)}(x)]^2 \tag{11}$$

elsewhere. 
Differentiating and using (5) we obtain 

$$\phi'(x) = \frac{2(2m + 1)x}{(n - m)(n + m + 1)}[P_n^{(m+1)}(x)]^2 ,$$

so that $\phi(x)$ is decreasing for $x < 0$ and increasing for $x > 0$. Then from 
(10) and (11), remembering that 

$$P_n^{(m)}(-1) = (-1)^{n+m}P_n^{(m)}(1),$$

it follows that 

$$|P_n^{(m)}(x)| \leq \frac{(n + m)!}{2^m m!\,(n - m)!} \quad \text{for } -1 \leq x \leq 1 \text{ and } n \geq m. \tag{12}$$
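
The endpoint value and the bound on the derivatives of the Legendre polynomials can be spot-checked with exact rational arithmetic. The sketch below is an illustration under stated assumptions: it takes the endpoint value to be $(n+m)!/(2^m m!\,(n-m)!)$, builds $P_n$ from Bonnet's recurrence (a standard construction, not taken from the paper), and uses hypothetical helper names.

```python
from fractions import Fraction
from math import factorial

def legendre_coeffs(n):
    """Coefficients of P_n, lowest degree first, from Bonnet's recurrence."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for k in range(1, n):
        # (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)
        nxt = [Fraction(2 * k + 1) * c for c in [Fraction(0)] + p]
        for i, c in enumerate(p_prev):
            nxt[i] -= k * c
        p_prev, p = p, [c / (k + 1) for c in nxt]
    return p

def derivative(coeffs, m):
    """m-th derivative of a polynomial given by its coefficient list."""
    for _ in range(m):
        coeffs = [i * c for i, c in enumerate(coeffs)][1:]
    return coeffs

def evaluate(coeffs, x):
    """Horner evaluation; an empty list is the zero polynomial."""
    total = Fraction(0)
    for c in reversed(coeffs):
        total = total * x + c
    return total

def endpoint_value(n, m):
    """Assumed closed form for the m-th derivative of P_n at x = 1."""
    return Fraction(factorial(n + m), 2**m * factorial(m) * factorial(n - m))

if __name__ == "__main__":
    for n in range(5):
        print(n, [evaluate(derivative(legendre_coeffs(n), m), Fraction(1)) for m in range(n + 1)])
```

Because the arithmetic is exact, agreement at $x = 1$ is an equality test, and the bound on $[-1, 1]$ can be sampled on a rational grid without rounding concerns.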


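To express the coefficients, $A_n$, in terms of integrals involving the 
$P_n^{(m)}(x)$, start with the definition 

$$A_n = \frac{2n + 1}{2}\int_{-1}^{1} f(x)P_n(x)\,dx. \tag{13}$$

Integrating by parts and using (6), 

$$A_n = \frac{2n + 1}{2}\,\frac{1}{n(n + 1)}\int_{-1}^{1} f'(x)(1 - x^2)P_n'(x)\,dx.$$

This integration by parts may be repeated m − 1 times and there finally 
results 

$$A_n = \frac{2n + 1}{2}\,\frac{(n - m)!}{(n + m)!}\int_{-1}^{1} f^{(m)}(x)(1 - x^2)^m P_n^{(m)}(x)\,dx. \tag{14}$$

Now using (14) and 

$$P_m^{(m)}(x) = \frac{(2m)!}{2^m m!}, \tag{15}$$

we can write 

$$\frac{A_m}{2m + 1} - \frac{A_n}{2n + 1} = \frac{1}{2}\int_{-1}^{1} f^{(m)}(x)(1 - x^2)^m\left[\frac{1}{2^m m!} - \frac{(n - m)!}{(n + m)!}P_n^{(m)}(x)\right]dx. \tag{16}$$

Since the bracket is non-negative by (12), the integrand here is positive if $f^{(m)}(x) > 0$, and the theorem follows. 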
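
A numerical illustration of the theorem may help. The sketch below is not from the paper: it assumes the test function $f(x) = e^x$ (every derivative of which is positive on $[-1, 1]$), computes the coefficients from the standard definition $A_n = \tfrac{2n+1}{2}\int_{-1}^{1} f P_n\,dx$ by composite Simpson integration, and prints the ratios $A_n/(2n+1)$, which should be non-increasing in $n$.

```python
import math

def legendre(n: int, x: float) -> float:
    """P_n(x) from Bonnet's recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def simpson(g, a: float, b: float, steps: int = 2000) -> float:
    """Composite Simpson rule; steps must be even."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def legendre_coefficient(f, n: int) -> float:
    """A_n = (2n + 1)/2 * integral of f(x) P_n(x) over [-1, 1]."""
    return (2 * n + 1) / 2 * simpson(lambda x: f(x) * legendre(n, x), -1.0, 1.0)

if __name__ == "__main__":
    for n in range(6):
        print(n, legendre_coefficient(math.exp, n) / (2 * n + 1))
```

For this choice of $f$ the inequality holds for every pair $m < n$, since all derivatives are positive.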

2. Application. Using the terminology and notation of I, we recall that 
$w$ is the potential of the forces between the particles and that $\beta$ is 
defined in terms of $w$ by equation (7) of I. Choosing coordinates with the 
polar axis through the line of symmetry of one particle, we have the 
expression for $\beta(u)$ as in equation (41) of I, 

$$\beta(u) = \int [e^{-w/kT} - 1]\,dV = \sum_{i=0}^{\infty} D_i P_i(u),$$

where $w = w(u, r)$ is the energy of interaction of the two particles with 
$u = \cos\theta$, $r$ is the separation of the particles, and the integration is over 
the separation of the particles, the angle, $\theta$, being held fixed. 

Since 

$$\frac{d\beta}{du} = -\frac{1}{kT}\int \frac{dw}{du}\,e^{-w/kT}\,dV,$$

then if $(dw/du) < 0$ for all $u$, $(d\beta/du) > 0$. Now $(dw/du) < 0$ for all $u$ 
means that the interaction is of the type where the particles will tend to 
orient without having any energy barriers to go over, and in which each 
particle will not have symmetry with respect to its two ends. In this case, 
from $(d\beta/du) > 0$ it follows, using our theorem, that 

$$D_1 > \frac{3}{2i + 1}D_i \quad \text{for } i > 1.$$

From this relation it follows that [see eq. (33) of I] the distribution 
function, $f(a)$, has the form 

$$4\pi f(a) = 1 + 3A_1P_1(u)$$

in the neighborhood of the transition. 
Zimmer (loc. cit.) has solved the problem exactly for $D_1 \neq 0$ and 
$D_i = 0$ for $i \neq 1$. We can now see that this solution corresponds to a 
physically realistic situation, at least in the neighborhood of the 
transition point, when the interaction between particles is of the type described. 
The case of $D_2 \neq 0$, $D_i = 0$ for $i \neq 2$, also treated by Zimmer, would 
be similarly justifiable if $\beta$ were even in $u$ and $(d^2\beta/du^2) > 0$. This 
would follow from 

$$\frac{d^2w}{du^2} - \frac{1}{kT}\left(\frac{dw}{du}\right)^2 < 0.$$

The characterization of $w$ by this relation is not obvious, but the relation 
might be useful if the dependence of $w$ on $u$ is known. 


The author thanks Dr. I. Isenberg for suggesting this problem and for 
helpful discussions. 

In connection with a previous paper, an expression is derived for the number of possibilities 
in which n distinguishable elements can be distributed into m < n classes, so that each class 
contains at least one element. 


In a previous paper (Rashevsky, 1954) we have outlined a topological 
approach to general biology and discussed a possible geometrical trans- 
formation, which may describe the development of a metazoan organism 
from a hypothetical protozoan. The total number of transformations of 
this type is very large but finite, and one of the problems is to derive 
an expression for that total number. The latter is determined basically 
by two considerations: First, the number of possible distinct ways in 
which n specializable biological functions can be distributed among 
m < n classes of cells; second, the number of total possible ways in which 
indistinguishable points may be distributed among m groups including 
possibilities that some of the m groups do not receive any points. In this 
note we shall discuss only the first problem. 

The n biological functions are all distinct. If they are divided, in the 
process of the specialization, among m classes of specialized cells, then 
each class must receive at least one biological function, otherwise there 
will be fewer than m classes. 

We shall denote by $R_m^n$ the total number of possible different ways of 
distributing n biological functions into m classes, and we shall prove that 

$$R_m^n = \frac{1}{m!}\left[m^n - \frac{m!}{1!\,(m - 1)!}(m - 1)^n + \frac{m!}{2!\,(m - 2)!}(m - 2)^n - \cdots + (-1)^{m-1}m\right]. \tag{1}$$

We shall also prove that the above expression has the following properties: 

a) $R_m^n = 1$ for n = m. This should be the case physically, since for 
n = m there is only one way of distributing the biological functions, 
namely, one to each of the m classes of cells. 

b) $R_m^n = 0$ for n < m, which expresses the fact that there is no way 
of distributing n elements into more than n classes if each class is to 
receive at least one element. 
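
Expression (1) and properties a) and b) are easy to check by direct evaluation. The following sketch assumes that (1) is the alternating sum $\frac{1}{m!}\sum_{k=0}^{m-1}(-1)^k\binom{m}{k}(m-k)^n$ with $n \geq 1$; the function name is hypothetical.

```python
from math import comb, factorial

def distributions(n: int, m: int) -> int:
    """Ways to split n distinct elements into m non-empty, unlabeled classes,
    evaluated from the alternating sum assumed for expression (1); n >= 1."""
    total = sum((-1) ** k * comb(m, k) * (m - k) ** n for k in range(m))
    return total // factorial(m)

if __name__ == "__main__":
    for n in range(1, 7):
        print(n, [distributions(n, m) for m in range(1, 7)])
```

Each printed row has a 1 on the diagonal (property a) and zeros to its right (property b).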


First we prove that expression (1), with the additional conditions a) 
and b), holds for m = 2. 

To distribute n distinguishable elements in two classes we may proceed 
as follows: 

We take first one element into one class, and the remaining n − 1 into 
the other. There are n ways of doing this. Then we take any two elements 
into one class, and the remaining n − 2 into the other. There are 

$$\frac{n!}{2!\,(n - 2)!} \tag{2}$$

ways of doing this. Then we take three elements into one class and n − 3 
into the other. The total number of possibilities is 

$$\frac{n!}{3!\,(n - 3)!}. \tag{3}$$

We thus proceed until we have n − 1 elements in one class and 1 in the 
other, which can be done in 

$$\frac{n!}{(n - 1)!\,1!} \tag{4}$$

ways. Altogether we find the following number of possibilities: 

$$n + \frac{n!}{2!\,(n - 2)!} + \frac{n!}{3!\,(n - 3)!} + \cdots + \frac{n!}{(n - 1)!\,1!} = 2^n - 2. \tag{5}$$

We thus certainly not only exhaust all the possibilities, but we actually 
have each possibility twice, because to any choice of p elements in one 
class and n − p in the other, there will also be a choice of the same 
n − p elements in the first class and of the same p in the other. Therefore 
we must divide (5) by 2 and thus find: 

$$R_2^n = 2^{n-1} - 1, \tag{6}$$

which is of the form (1). It is readily verified that (6) satisfies conditions 
a) and b). 
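
The double-counting argument for m = 2 can be reproduced by brute force. The sketch below is illustrative only: it enumerates every assignment of n labeled elements to two labeled classes, discards assignments that leave a class empty, and halves the count because swapping the two classes gives the same distribution. The function name is hypothetical.

```python
from itertools import product

def two_class_splits(n: int) -> int:
    """Count splits of n distinct elements into two non-empty, unlabeled classes."""
    labelled = sum(
        1 for assignment in product((0, 1), repeat=n)
        if 0 < sum(assignment) < n  # both classes must be occupied
    )
    return labelled // 2  # each split was counted once per ordering of the classes

if __name__ == "__main__":
    for n in range(2, 9):
        print(n, two_class_splits(n), 2 ** (n - 1) - 1)
```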


Now we shall prove the following statement: If expression (1) holds for 
m and also satisfies for m the conditions a) and b), then it also holds for 
m + 1. 

To enumerate all the possibilities of distributing n elements into m + 1 
classes we proceed in the following manner: 

We pick out one of the n elements into one class, leaving n − 1 elements 
to be distributed in the remaining m classes. The number of different 
distributions of n − 1 elements in m classes is $R_m^{n-1}$. The number of 
choices of one element is n. Hence altogether we have 

$$nR_m^{n-1} \tag{7}$$

possibilities. Then we take two elements in one class, which can be done in 

$$\frac{n!}{2!\,(n - 2)!} \tag{8}$$

ways, and distribute the remaining n − 2 elements in m classes, which 
can be done in $R_m^{n-2}$ different ways. Altogether we have now 

$$\frac{n!}{2!\,(n - 2)!}R_m^{n-2} \tag{9}$$

possibilities. Choosing p elements in one class, and the remaining n − p 
into m classes, gives us 

$$\frac{n!}{p!\,(n - p)!}R_m^{n-p} \tag{10}$$

possibilities. In the choice of p we can go, however, only up to 

$$p = n - m, \tag{11}$$

which leaves just m elements to be distributed in m classes, for which the 
number of possibilities is $R_m^m = 1$. Making p > n − m would make it 
impossible to distribute n elements into m + 1 classes. 

If we take the sum of expressions (10) from p = 1 to p = n − m we 
shall have not only all the possibilities of distributing n elements into 
m + 1 classes included, but we shall actually have each possible distribution 
appearing m + 1 times. For each choice of p there will be a set of m 
classes with corresponding numbers $n_i$ (i = 1, 2, ... m) in each. Each 
of these m $n_i$'s will appear as a p in the first class, with p and the remaining 
m − 1 $n_i$'s now forming a distribution of the remaining elements in m classes. 
Thus we shall have m + 1 identical distributions: 

$$p;\ n_1, n_2, \ldots, n_m$$
$$n_1;\ p, n_2, \ldots, n_m$$
$$\ldots$$

Hence $R_{m+1}^n$ is obtained by dividing the sum of expressions (10) by m + 1. 
We thus find: 

$$R_{m+1}^n = \frac{1}{m + 1}\left[nR_m^{n-1} + \frac{n!}{2!\,(n - 2)!}R_m^{n-2} + \cdots + \frac{n!}{(n - m)!\,m!}R_m^m\right]. \tag{12}$$

Each of the $R$'s in the brackets is, according to (1), the sum of m terms, 
and each contains the factor 1/m!. 
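
The recurrence (12) can be compared with the closed form (1) for small n and m before following the algebra. The sketch below assumes (12) reads $R_{m+1}^n = \frac{1}{m+1}\sum_{p=1}^{n-m}\binom{n}{p}R_m^{n-p}$ and starts from $R_1^n = 1$; both function names are hypothetical.

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def r_recursive(n: int, m: int) -> int:
    """R_m^n built up from the assumed form of recurrence (12)."""
    if m == 1:
        return 1 if n >= 1 else 0  # one class holds everything
    if n < m:
        return 0  # property b)
    prev = m - 1
    # Choose p elements for one class, distribute the other n - p among prev classes.
    total = sum(comb(n, p) * r_recursive(n - p, prev) for p in range(1, n - prev + 1))
    return total // m  # every distribution was counted m times

def r_closed(n: int, m: int) -> int:
    """Closed form assumed for expression (1)."""
    total = sum((-1) ** k * comb(m, k) * (m - k) ** n for k in range(m))
    return total // factorial(m)

if __name__ == "__main__":
    for n in range(1, 8):
        print(n, [r_recursive(n, m) for m in range(1, n + 1)])
```

Agreement for small cases does not replace the inductive proof, but it guards against sign or index slips when reading the garbled intermediate expressions.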


Introducing (1) into (12), factoring out 1/m!, adding and subtracting 
terms, rearranging and introducing the notation: 


(Pareenn (13) 


we find: 


Rati = sac >) m? — > (per m* —1 
7) DAG ,) (om = yr SG) om — 1)? Gi aa 


(14) 


17s eae Ge 


Using the binomial theorem we see that the first sums in each line are 
the nth powers of (m + 1), m, (m − 1), etc. Therefore expression (14) 
may be written thus: 


Russ Gp iyl tO" = mw — SO) — 


+ (=O, 7) 8 +2)"— (mk 1)" 


— SG) ety 1]+ (ye ™ Vm b tay" 


— (m—k)*— Ce m1 


a Oey Ges 


p=1 


(15) 
Consider any two consecutive lines, for example, the kth and the 
(k + 1)th. The second term in brackets of the kth line together with the 
first term in brackets of the (k + 1)th line give: 


n : 
(—1)**1(m —k+1) ”! (Gage tmp) 


m+1 
CAB are 


ei LE — kh - 1)! 
if 
=(-pe( UTE) ete. 


Furthermore the last terms of each line give together: 


-14+(T)-(3)+-.-- fot B= -1+(7) 


(9 = CD = epee oD 


of (— 1) mao 1 (-— 1) m+2 yy : 


The sum of the first m + 1 terms, by the binomial theorem, is equal to 
—(1 — 1)” = 0; the sum of the last two is equal to (—1)”*? (m + 1). 
Hence expression (15) may be written: 


(m+1)! 9, (m+1)! 


7 1 5 
Re = aeeqyiy mt) aerial 
x (m=1)"— 2. (HD (mth | 


(18) 


ee nw? — (1) S0(5) om je 
sal) ACN DUE: 


Collect all the first terms of each sum in the brackets, that is, the terms 
which correspond to p = 1. They give 


m 


! 
mn —m (m —1) n+ — ara wim — 2) — 


nnn (t( 5 4 (F 1) 5 ee tOM) 


= (1—1)""!=0. 
Collecting the terms for any other value p, we find: 


n! (m>— m!| (m—1)?+ (m — 2)? 


! m! 
(n— p) |p! (m—1)! (a = 2) 2 


(19) 

ree at (= 1) ™+m). 
According to (1), the expression in the large parentheses is nothing but 
$m!\,R_m^p$. But since p < m, and $R_m^p$ satisfies condition b), therefore expression 
(19) is equal to zero. Hence the whole expression in brackets of (18) is 
zero, and therefore: 

$$R_{m+1}^n = \frac{1}{(m + 1)!}\left[(m + 1)^n - \frac{(m + 1)!}{m!\,1!}m^n + \frac{(m + 1)!}{(m - 1)!\,2!}(m - 1)^n - \cdots + (-1)^m(m + 1)\right], \tag{20}$$

which is of the same form as (1). 

It does not follow, however, from the above that $R_{m+1}^n$ also satisfies 
conditions a) and b). This, however, is easily demonstrated. Expression 
(12) holds formally for any m and n. Though $R_m^p$ vanishes for p < m 
and is equal to 1 for p = m, yet in all cases it is of the form (1). Put 
n = m + 1 in expression (12). Since conditions a) and b) are satisfied 
for $R_m^p$, the first term in brackets becomes $(m + 1)R_m^m = m + 1$, 
because of a); all others vanish because of b). Hence 

$$R_{m+1}^{m+1} = 1. \tag{21}$$

Now put in expression (12) n = p < m + 1. Then all the terms in brackets 
vanish, because of b). Thus our proof is complete. 


The author is indebted to Dr. Ernesto Trucco for checking the manu- 
script. 

This work was aided by a grant from the Dr. Wallace C. and Clara A. 
Abbott Memorial Fund of the University of Chicago. 

The relation between the molecular interaction of colloidal particles in solution and the 
formation of an anisotropic phase is examined in two special cases. It is shown that for an 
asymmetrical potential a continuous transition from the unordered to the nematic state is 
possible, while for a symmetrical interaction potential the most stable transition is 
discontinuous. 


The biological interest in solutions of macromolecules is especially 
evident for those systems which exhibit particle alignment. In particular, 
anisotropic systems are of significance in studies directed toward the 
understanding of such structures as the mitotic spindle. 

Advances in the theory of solutions of anisometric particles were made 
recently by L. Onsager (1949). One aspect of the theory was extended by 
I. Isenberg (1954) who demonstrated that the nature of the concentra- 
tion dependent transition (at constant temperature) between isotropic 
and anisotropic phases depends upon the symmetry of the interaction 
between two colloidal particles. In order to clarify those general con- 
siderations some special cases are here taken of particles with simple 
interactions and the resultant transitions between ordered and disordered 
states are examined. For the notation and background of quantities not 
discussed here the work of Isenberg (loc. cit.) should be consulted. 

Beginning with the interaction potential $w$ between pairs of particles 
the quantity 

$$\beta(a, a') = \int (e^{-w/kT} - 1)\,dV$$

may be constructed for two particles of orientation $a$ and $a'$ in a volume $V$. 
Consider now particles having no center of symmetry. Then the expansion 
of $\beta$ in a series of Legendre polynomials will surely contain odd terms 
and, from a mathematical standpoint, in the simplest instance may be 
represented by the single term 

$$\beta(a, a') = D_1P_1(a, a')$$

where $P_1$ is the first Legendre polynomial and $D_1$ is a constant. A solution 
of particles characterized by such a function will be shown to give rise to 
a continuous transition of the second kind between isotropic and 
anisotropic phases. 

Subsequently the case will be discussed in which the interaction 
potential $w$ is evenly symmetrical so that $\beta(a, a') = \beta(a, -a')$. In terms of 
Legendre polynomials the simplest $\beta$ incorporating this property is 

$$\beta(a, a') = D_2P_2(a, a') .$$

This case is shown to have a discontinuous phase transition. 

It will be convenient to introduce these specializations at the outset 
and let 

$$\beta_k(a, a') = D_kP_k(a, a'), \quad k = 1, 2. \tag{1}$$

Furthermore, of the possible kinds of anisotropy exhibited by solutions 
of anisometric particles, attention will be restricted here to those which 
have a single axis of symmetry or direction of isotropy. Coordinates may 
then be selected with this optical axis as the reference polar axis and 
alignment of the particles may be described in terms of the azimuthal 
angle $\theta$ alone, independent of the longitudinal angle $\phi$. It is to be 
understood that the particles have an axis of symmetry by means of which 
their orientation may be specified. 

Now in order to examine the possible changes in free energy of the 
solution of particles it is necessary to determine the angular distribution 
function $f(a)\,d\Omega$ giving the fraction of particles oriented about $\theta$ in the 
element of solid angle $d\Omega$. By the foregoing restriction $f(a) = f(\theta)$ only. 
A necessary condition that the free energy be stationary is (Onsager, 
loc. cit.) 

$$\ln 4\pi f(a) = \nu - 1 + c\int \beta(a, a')f(a')\,d\Omega', \tag{2}$$

where $c = N/V$ is the number of particles per unit volume and $\nu$ is a 
Lagrange multiplier. 

Let $\mu = \cos\theta$. Using (1) and the addition theorem for Legendre 
polynomials, (2) becomes 

$$\ln 4\pi f(\mu) = \nu - 1 + cD_kP_k(\mu)\int P_k(\mu')f(\mu')\,d\Omega'. \tag{3}$$

With the abbreviation 

$$\zeta_k = cD_k\int P_k(\mu)f(\mu)\,d\Omega \tag{4}$$

and the normalization condition $\int f(\mu)\,d\Omega = 1$, the formal solution of (3) is 

$$f(\mu) = \frac{e^{\zeta_kP_k(\mu)}}{\int e^{\zeta_kP_k(\mu)}\,d\Omega}. \tag{5}$$

The substitution of (5) into (4) gives this relation between $cD_k$ and $\zeta_k$ 
when the free energy is an extremum: 

$$cD_k = \frac{\zeta_k\int e^{\zeta_kP_k(\mu)}\,d\Omega}{\int P_k(\mu)e^{\zeta_kP_k(\mu)}\,d\Omega}.$$


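Now the free energy of the solution of particles less that of the solvent 
is (Isenberg, loc. cit.) 

$$\mathcal{F} = N\mu_0 + NkT(\ln c - 1 + \psi), \tag{6}$$

where 

$$\psi = \sigma + \frac{c}{2}g, \tag{7}$$

$$\sigma = \int f(a)\ln 4\pi f(a)\,d\Omega, \tag{8}$$

$$g = -\iint \beta(a, a')f(a)f(a')\,d\Omega\,d\Omega'. \tag{9}$$

Therefore the colloidal solution will be in equilibrium for those values 
of $\psi$ given by the substitution of (5) into (7). Combining (5) and (8), 

$$\sigma = \int f(\mu)\left[\zeta_kP_k(\mu) - \ln\frac{1}{4\pi}\int e^{\zeta_kP_k(\mu)}\,d\Omega\right]d\Omega = \frac{\zeta_k^2}{cD_k} - \ln K_k, \tag{10}$$

where 

$$K_k = \frac{1}{4\pi}\int e^{\zeta_kP_k(\mu)}\,d\Omega. \tag{11}$$

Combining (5) and (9), 

$$g = -\frac{\zeta_k^2}{c^2D_k}. \tag{12}$$

Therefore $\psi$ from (7) becomes 

$$\psi = \frac{\zeta_k^2}{2cD_k} - \ln K_k. \tag{13}$$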

If the solution is isotropic the particles have no preferred direction of 
orientation and $f(\mu)$ must be independent of $\mu$. From (5) this implies the 
vanishing of $\zeta_k$. Therefore, from (5), (11), and (13) the isotropic state 
may be characterized by 

$$\zeta_k = 0, \quad f(\mu) = \frac{1}{4\pi}, \quad K_k = 1, \quad \psi = 0. \tag{14}$$

In general, then, it is seen that the colloidal solution may undergo a 
phase transition from the isotropic state whenever $\psi < 0$, or by (13) 

$$\frac{\zeta_k^2}{2cD_k} - \ln K_k < 0. \tag{15}$$

It is evident that such a transition must occur with increasing 
concentration of particles. Equation (4) shows that for sufficiently low 
concentration $\zeta_k$ may be made so small that the isotropic conditions (14) may be 
approximated, while it may be shown that the right-hand side of (5) 
tends to a $\delta$-function with large concentration. Thus the solution would 
eventually become entirely aligned if these expressions for dilute 
solutions were valid at high concentrations. It will be shown that phase 
transitions do occur for the special cases considered here. 

It should be noted that the condition (15) only shows that a transition 
is possible from the isotropic to an anisotropic state where the free energy 
is an extremum. It must be further demonstrated that in this anisotropic 
state the free energy is actually a minimum. For this purpose the 
expression (5) may be thought of as a trial function for the distribution with $\zeta_k$ 
as an undetermined parameter. Then the derivatives of $\psi$ may be taken 
with respect to $\zeta_k$ to determine those values for which the free energy is 
minimum 


Hy Ble 

With the further notation 

$$F_k = \int P_k(\mu)f(\mu)\,d\Omega, \tag{16}$$

$$M_k = \int P_k^2(\mu)f(\mu)\,d\Omega, \tag{17}$$

(8) and (9) may be written as 

$$\sigma = \zeta_kF_k - \ln K_k, \qquad g = -D_kF_k^2,$$

so that from (7) 

$$\psi = \zeta_kF_k - \ln K_k - \frac{cD_k}{2}F_k^2. \tag{18}$$

Now using (11) and (16) 

$$\frac{dK_k}{d\zeta_k} = \frac{1}{4\pi}\int P_k(\mu)e^{\zeta_kP_k(\mu)}\,d\Omega = F_kK_k, \tag{19}$$

so that from (5), (11) and (16) 

$$\frac{df}{d\zeta_k} = (P_k - F_k)f, \tag{20}$$

and by (16), (17) and (20) 

$$\frac{dF_k}{d\zeta_k} = M_k - F_k^2. \tag{21}$$

Since $f(\mu)$ is a probability density function, then from the definitions 
(16) and (17) it is seen that $F_k$ and $M_k$ are the first and second moments 
respectively of $P_k(\mu)$. Therefore the variance (21) is positive. 
Using (21) and (19), the derivative of (18) becomes 

$$\frac{d\psi}{d\zeta_k} = (\zeta_k - cD_kF_k)(M_k - F_k^2), \tag{22}$$

and since the second factor on the right-hand side is positive, the 
derivative vanishes when $\zeta_k$ and $F_k$ are related as in (4). With (21) the second 
derivative follows from (22): 

$$\left.\frac{d^2\psi}{d\zeta_k^2}\right|_{\zeta_k = cD_kF_k} = (M_k - F_k^2)\left[1 - \frac{\zeta_k}{F_k}(M_k - F_k^2)\right].$$

The criterion for stable solutions thus becomes 

$$M_k - F_k^2 < \frac{F_k}{\zeta_k}. \tag{23}$$
It remains now to determine at what concentrations the ordered and 
disordered states can coexist. For these equilibrium transitions the 
chemical potentials and osmotic pressures of the separate phases must be 
equal: 

$$\mu_a = \mu_i, \qquad P_a = P_i, \tag{24}$$

where the subscripts $a$, $i$ refer to the anisotropic and isotropic states 
respectively. Using (6) and the fact that the derivative of $\psi$ with respect to 
a parameter must vanish at equilibrium, 

$$P = -\frac{\partial\mathcal{F}}{\partial V} = kTc\left(1 + \frac{c}{2}g\right), \tag{25}$$

$$\mu = \frac{\partial\mathcal{F}}{\partial N} = \mu_0 + kT(\ln c + \sigma + cg). \tag{26}$$

With the equilibrium values of $\sigma$, $g$ from (10), (12) and noting that $\sigma_i = 
g_i = 0$ by reason of the isotropic conditions (14), then combining (24), 
(25), (26), 

$$c_a - \frac{\zeta_k^2}{2D_k} = c_i$$

and 

$$\ln c_a - \ln K_k = \ln c_i .$$

These two equations may be further combined to give 

$$c_a\left(1 - \frac{1}{K_k}\right) = \frac{\zeta_k^2}{2D_k} \tag{27}$$

as the condition for coexistence of the separate phases. 

The procedure is now clear for the examination of the individual cases 
in (1). The minimum concentration will be found from (15) for which 
phase transitions to the anisotropic state are possible. The concentration 
region in which the separate phases can exist together in equilibrium is 
found from (27) which also shows the individual concentrations of these 
two states there. And, finally, unstable transitions are detected by (23). 

Case I: $\beta_1(\mu) = D_1\mu$. The following may be computed directly from 
the definitions (11), (5), (16), (17), (4): 

$$K_1 = \frac{1}{4\pi}\int e^{\zeta_1\mu}\,d\Omega = \frac{\sinh\zeta_1}{\zeta_1}, \tag{28}$$

$$f(\mu) = \frac{e^{\zeta_1\mu}}{\int e^{\zeta_1\mu}\,d\Omega} = \frac{\zeta_1e^{\zeta_1\mu}}{4\pi\sinh\zeta_1}, \tag{29}$$

$$F_1 = \frac{\zeta_1}{4\pi\sinh\zeta_1}\int \mu e^{\zeta_1\mu}\,d\Omega = \coth\zeta_1 - \frac{1}{\zeta_1}, \tag{30}$$

$$M_1 = \frac{\zeta_1}{4\pi\sinh\zeta_1}\int \mu^2e^{\zeta_1\mu}\,d\Omega = 1 - \frac{2}{\zeta_1}\left(\coth\zeta_1 - \frac{1}{\zeta_1}\right), \tag{31}$$

$$cD_1 = \frac{\zeta_1}{\coth\zeta_1 - 1/\zeta_1}. \tag{32}$$

When (28) and (32) are inserted into (15) it can be seen that $c = 3/D_1$ 
(where $\zeta_1 = 0$) is the minimum concentration at which the solution can 
become anisotropic. Examination of (27) with (28) and (32) shows that 
this concentration is the only equilibrium transition point. And, finally, 
numerical analysis of condition (23) with (30) and (31), which reduces to 

$$1 - \frac{3}{\zeta_1}F_1 - F_1^2 < 0,$$

shows that this transition is stable. Thus at $c = 3/D_1$ there occurs a 
transition of the so-called second kind from isotropic to nematic states with a 
distribution about the polar axis given by (29). The particle alignment, as 
measured by the order function $g$, increases with the concentration beyond 
this point. 
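
The Case I statements can be checked numerically. The sketch below is an illustration under stated assumptions: it takes $K_1 = \sinh\zeta_1/\zeta_1$, $F_1 = \coth\zeta_1 - 1/\zeta_1$, $M_1 = 1 - 2F_1/\zeta_1$ and $cD_1 = \zeta_1/F_1$ as the closed forms for this case, and the stability criterion as $M_1 - F_1^2 < F_1/\zeta_1$; the function names are hypothetical.

```python
import math

def f1(z: float) -> float:
    """First moment F_1 = coth(z) - 1/z (assumed closed form), z != 0."""
    return 1.0 / math.tanh(z) - 1.0 / z

def m1(z: float) -> float:
    """Second moment M_1 = 1 - (2/z) F_1 (assumed closed form)."""
    return 1.0 - 2.0 * f1(z) / z

def c_times_d1(z: float) -> float:
    """Equilibrium relation c*D_1 = z / F_1."""
    return z / f1(z)

def is_stable(z: float) -> bool:
    """Stability criterion M_1 - F_1^2 < F_1 / z."""
    return m1(z) - f1(z) ** 2 < f1(z) / z

if __name__ == "__main__":
    for z in (1e-3, 0.5, 1.0, 2.0, 5.0):
        print(f"zeta={z}: cD_1={c_times_d1(z):.5f}, stable={is_stable(z)}")
```

As $\zeta_1 \to 0$ the product $cD_1$ approaches 3 from above, consistent with $c = 3/D_1$ being the lowest concentration at which an anisotropic solution exists.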
Turning now to Case II where 

$$\beta_2(\mu) = \frac{D_2}{2}(3\mu^2 - 1),$$

the following quantities may again be computed directly from the 
definitions (11), (5), (16), (17), (4): 

$$K_2 = \frac{1}{4\pi}\int e^{(\zeta_2/2)(3\mu^2 - 1)}\,d\Omega = Ee^{-\zeta_2/2}, \tag{33}$$

where 

$$E(\zeta_2) = \int_0^1 e^{(3/2)\zeta_2\mu^2}\,d\mu$$
0 
1 VG/)E, 
naees ef for 622 0 
2 2 
Legh Td V (3/2) |f, 
“Vel f ‘| for fas UF 
e (3/2) Eu? 
Lr 1 (e@/t, 
PF = aR 3 yu? — 1) e8/2)5e'dQ = ( a )- 
>= ag J (But 1) e@/ntw'da = et a a 
(3/2) £5 
M2= abt eats =< (3-%) |, 8) 
oi Ks! Ce es ay) 
feck ) 


The relation (37) is shown in Figure 1. The condition (15) with (33), (37) 
is plotted in Figure 2 and, in conjunction with the previous figure, shows 
that the minimum concentration at which a transition may occur is 
$c = 4.54/D_2$. 

Substituting (33) and (37) into the criterion (27) for the equilibrium 
concentrations, it is seen from Figure 3 that two such points exist. At 
$\zeta_2 = 0$, $c = 5/D_2$ (Fig. 1) and by (27) this is the concentration of both 
phases. Thus a continuous equilibrium transition of the second kind is 
possible here as in Case I. However, the point $\zeta_2 = -1.70$ of Figure 3 
corresponds to an anisotropic concentration of $6.85/D_2$ and from (27) 

$$c_i = c_a\left(1 - \frac{\zeta_2^2}{2c_aD_2}\right) = \frac{5.40}{D_2},$$

so that in this concentration region a discontinuous equilibrium transition 
of the first kind is also possible, with a concentration ratio of 1.27 for the 
mixture. Thus if the solution is isotropic up to $c = 5.40/D_2$, an anisotropic 
phase begins to form, and as the total particle concentration is increased 
this more concentrated, ordered state grows in amount until at $c = 
6.85/D_2$ the mixture has disappeared and the whole solution is anisotropic. 
As the total concentration is made to increase beyond this point, the 
ordering increases also. 


FIGURE 3 


This Case II is to be further distinguished from Case I. Here it will be 
noted that $\zeta_2 < 0$ for the anisotropic state. An examination of (34) for 
negative $\zeta_2$ reveals that the particles tend to align about the direction 
$\theta = \pi/2$. Since the foregoing has been derived on the assumption that 
the orientation is to be independent of the longitudinal angle $\phi$, the particles 
must be considered as tending to align in directions perpendicular to 
the optical axis $\theta = 0$. There is, therefore, no common axial direction of 
the particles as in Case I. 

Thus there are three possible transitions in this case. With increasing 
concentration the first is a non-equilibrium discontinuous transition to 
the nematic state ($\zeta_2 > 0$) at $c = 4.54/D_2$. Failing this, at $c = 5.00/D_2$ 
there is a possible equilibrium continuous phase transition of the second 
kind to a non-nematic state. And at $c = 5.40/D_2$, an equilibrium transition 
of the first kind to a non-nematic state is possible by means of a mixture 
or coacervate. 

FIGURE 4 


Finally Figure 4, a graph of (23) with (35), (36), (37) which reduces to 


1 €(3/2)f2 / e(8/2)S5 1 
2— a ae 
“al E paras 36) SH 


shows that all three anisotropic states are internally stable. However, in 
terms of relative stability it is seen from Figure 5 (eq. (13)) that the free 
energy (and yw) of the nematic phase formed at A (c = 4.54/D;) is lowest, 
so that the other anisotropic states must be regarded as metastable. 
Whether or not either or both equilibrium transitions could be made 
experimentally to occur before the whole solution became nematic is a 
question which cannot be answered by these methods of equilibrium 
thermodynamics. It seems unlikely, however, that mixtures or so-called 
coacervates would form under the restrictions set forth here. 


I should like to acknowledge the help of Dr. I. Isenberg in many 
critical discussions on this problem. 

This investigation was supported by a research grant G-3312 from The 
National Cancer Institute, of the National Institutes of Health, Public 
Health Service. 
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Equations are derived to represent the time course of the population numbers of the vari- 
ous stages of the flour beetle. The assumption of constant duration of the life stages and the 
absence of delayed effects leads to equations from which the various population numbers can 
be calculated in terms of the parameters of the system. Formulae are given for the estimation of 
many of the latter from observations on population numbers. Calculations show that the 
principal features of the observed changes in population structure can be accounted for on 
the basis of a simple model in which it is further assumed that each interaction is proportional 
to the product of the numbers of the two given interacting stages. A more detailed analysis 
may require secondary interaction coefficients. Suggestions for estimation of such coefficients 
are given. 


The following model attempts to give the values of the numbers of the 
various life history stages as a function of time in terms of various param- 
eters and functional relations. J. Stanley (1934) has given a general 
treatment of this type of problem. The following mathematical treatment 
requires that certain assumptions be made. However, we shall see subse- 
quently that the restrictions are not too strong but enable one to calcu- 
late the population variables readily when the parameters are known, 
and also enable one to calculate parameters from data on the population 
variables. In order to proceed we introduce the following assumptions 
and notations. 

Assumptions. (1) The durations of the various stages are constant and 
the same for all individuals. (2) The sex ratio is approximately constant. 
(3) The various functional relations (to be discussed subsequently) de- 
pend only upon the immediate values of the variables (or perhaps upon 
delayed values in which the delay is not appreciably greater than a census 
interval). (4) The mean value of any function during the interval between 
one census and the next is not very different from the average value of the 
function at the beginning and that at the end of the period. 

For the sake of definiteness and since the model has been worked out 
principally for the flour beetle, we shall first consider this special case. 
That assumption (1) is justified for the flour beetle can be seen from the 
values of the coefficient of variation of some values given in Table 6 of 
D. Strawbridge (1953). As to assumption (2) there is some information 
to indicate that it is satisfactory (Strawbridge, loc. cit.). An inspection of 
the data will show that assumption (4) is quite satisfactory except for 
perhaps a small percentage of the time. Assumption (3) is introduced 
since there is no evidence of important delayed effects in which neglecting 
the delay is serious. This point will be discussed subsequently. 

Notation. We introduce the following symbols to represent the popula- 
tion numbers for the various life history stages which we wish to calculate 
as functions of the time $t$: $E$, $S$, $M$, $L$, $P$, $a$, and $A$ refer respectively to the 
eggs, small larvae, medium larvae, large larvae, pupae, immature adults, 
and mature adults. The time $t$ will be given in terms of periods of three 
days for reasons which will be evident. 

From the data given above (Strawbridge, loc. cit.) we see that to a 
fairly satisfactory approximation the durations of the various phases are 
very nearly equal to the following integral values: eggs, 4 days; small 
larvae (first two instar stages), 6 days; similarly, medium and large 
larvae, pupae, and immature adults, 6 days each. Thus in terms of the 
three-day census interval these values are 4/3, 2, 2, 2, 2, 2, respectively, for eggs, 
etc., through immature adults. The fact that observed values are so close 
to these integers simplifies the calculations. 

Let $r_E$ denote the number of fertile eggs laid per period of three days 
per adult, $r_E$ being a function of various variables. Similarly let $S_E$ be 
the value of the survival function for eggs, i.e., the fraction of a given 
group of eggs which remains at the end of a three-day period under a 
given set of conditions. Similarly let $S_S$, $S_M$, $S_L$, $S_P$, $S_a$ represent the 
survival functions for the small, medium, and large larvae, the pupae and 
immature adults. These functions will depend upon the numbers $E$, $S$, 
..., $A$ in a manner which depends upon the behavior of the members of 
the various stages. For the present, the mechanics of this behavior will 
not be discussed. Rather, the functions will be expressed graphically as 
well as possible from an analysis of the data, these values then being used 
in the equations to be derived. 

Derivation of equations. We first consider the expression determining 
the number of eggs at any time $t$. Those eggs laid prior to time $t$ by more 
than 4/3 of a period (or 4 days) will have hatched. Those laid at some time $\tau$ to $\tau + d\tau$ 
prior to $t$, numbering $r_E(t - \tau)A(t - \tau)\,d\tau$, will remain as eggs if they survive 
cannibalism due to other stages. The probability of survival is just $S_E^{\tau}(t - \tau/2)$, 
$\tau$ being the length of time they must survive, the survival function 
being evaluated at the middle of the interval. If we integrate over all 
values of $\tau$ from $\tau = 0$ to $\tau = 4/3$ we will calculate the number of 
eggs at time $t$, that is, 

$$E(t) = \int_0^{4/3} r_E(t - \tau)\,A(t - \tau)\,S_E^{\tau}\!\left(t - \frac{\tau}{2}\right) d\tau. \tag{1}$$


Since the integration is carried out over a rather short period of time com- 
pared to a life cycle time and because of assumption (4) we may replace 
the integral by the mean value theorem and assume that the mean value 
occurs at the midpoint of the interval. Thus we obtain the very close 
approximation 

$$E(t) = \tfrac{4}{3}\,r_E A\!\left(t - \tfrac{2}{3}\right) S_E^{2/3}\!\left(t - \tfrac{1}{3}\right). \tag{2}$$
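
As a rough check on how little is lost in passing from the integral (1) to the one-point form (2), the following sketch compares the two numerically. The laying function `rate` and the constant survival `survival` are hypothetical inputs chosen only for illustration; they are not taken from the data.

```python
def eggs_integral(rate, survival, t, duration=4 / 3, steps=2000):
    """Eq. (1): integrate rate(t - tau) * survival(t - tau/2)**tau for 0 <= tau <= duration."""
    h = duration / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h  # midpoint of the k-th subinterval
        total += rate(t - tau) * survival(t - tau / 2) ** tau * h
    return total


def eggs_midpoint(rate, survival, t, duration=4 / 3):
    """Eq. (2): the integrand evaluated once, at tau = duration / 2."""
    mid = duration / 2
    return duration * rate(t - mid) * survival(t - mid / 2) ** mid


# Hypothetical inputs (assumptions made for this illustration only).
def rate(t):
    """r_E(t) * A(t): eggs laid per period by all adults."""
    return 25.0 * (40.0 + 2.0 * t)


def survival(t):
    """S_E(t): fraction of eggs surviving one period."""
    return 0.8


exact = eggs_integral(rate, survival, t=10.0)
approx = eggs_midpoint(rate, survival, t=10.0)
relative_error = abs(exact - approx) / exact
print(f"integral {exact:.1f}, one-point form {approx:.1f}, relative error {relative_error:.4f}")
```

With slowly varying inputs such as these the two values agree to well within a few per cent; the agreement would be expected to worsen when the numbers change rapidly within a single period.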


In this and subsequent expressions we write $r_E A(t)$ to mean $r_E(t)A(t)$. 
The derivation of the number of small larvae at any time $t$ may be 
made in a similar manner. Introducing mean values as in (2) we may 
say that the number of small larvae $S(t)$ at time $t$ is calculated by determining 
how many of the eggs laid at a time between $t - \tfrac{10}{3}$ and $t - \tfrac{4}{3}$ 
survived as eggs for 4/3 of a period and then survived for a time between 
0 and 2 periods as small larvae until the time $t$. Eggs laid before $t - \tfrac{10}{3}$ 
would be medium larvae at $t$ while eggs laid after $t - \tfrac{4}{3}$ would still be 
eggs. The number of eggs laid during the interval would be very nearly 
$2 r_E A(t - \tfrac{7}{3})$. The probability of survival as eggs for 4/3 periods from $t - \tfrac{7}{3}$ 
to $t - 1$ would be $S_E^{4/3}(t - \tfrac{5}{3})$ and the probability of survival as larvae 
for an average of 1 period from $t - 1$ to $t$ would be $S_S(t - \tfrac{1}{2})$. Hence we 
find 

$$S(t) = 2\,r_E A\!\left(t - \tfrac{7}{3}\right) S_E^{4/3}\!\left(t - \tfrac{5}{3}\right) S_S\!\left(t - \tfrac{1}{2}\right). \tag{3}$$


In exactly the same way we find the remaining population numbers, 
except for the adults: 


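$$M(t) = 2\,r_E A\!\left(t - \tfrac{13}{3}\right) S_E^{4/3}\!\left(t - \tfrac{11}{3}\right) S_S^{2}(t - 2)\,S_M\!\left(t - \tfrac{1}{2}\right), \tag{4}$$

$$L(t) = 2\,r_E A\!\left(t - \tfrac{19}{3}\right) S_E^{4/3}\!\left(t - \tfrac{17}{3}\right) S_S^{2}(t - 4)\,S_M^{2}(t - 2)\,S_L\!\left(t - \tfrac{1}{2}\right), \tag{5}$$

$$P(t) = 2\,r_E A\!\left(t - \tfrac{25}{3}\right) S_E^{4/3}\!\left(t - \tfrac{23}{3}\right) S_S^{2}(t - 6)\,S_M^{2}(t - 4)\,S_L^{2}(t - 2)\,S_P\!\left(t - \tfrac{1}{2}\right), \tag{6}$$

$$a(t) = 2\,r_E A\!\left(t - \tfrac{31}{3}\right) S_E^{4/3}\!\left(t - \tfrac{29}{3}\right) S_S^{2}(t - 8)\,S_M^{2}(t - 6)\,S_L^{2}(t - 4)\,S_P^{2}(t - 2)\,S_a\!\left(t - \tfrac{1}{2}\right). \tag{7}$$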

For the adult number we can only calculate how many additional individuals 
$\Delta A(t)$ entered the ranks from $t$ to $t + 1$, since there is no definite 
life stage period. This calculation proceeds just as the above except that 
the mortality of the newly arriving mature adults is neglected in comparison 
to the mortality of adults already present. This latter is taken 
into account by a variable coefficient $d$, which is the mortality rate, depending 
itself upon the age distribution of adults and other variables. 
Thus we have 

$$\Delta A(t) = r_E A\!\left(t - \tfrac{65}{6}\right) S_E^{4/3}\!\left(t - \tfrac{61}{6}\right) S_S^{2}\!\left(t - \tfrac{17}{2}\right) S_M^{2}\!\left(t - \tfrac{13}{2}\right) S_L^{2}\!\left(t - \tfrac{9}{2}\right) S_P^{2}\!\left(t - \tfrac{5}{2}\right) S_a^{2}\!\left(t - \tfrac{1}{2}\right) - d\,A\!\left(t + \tfrac{1}{2}\right) \tag{8}$$

as the number to be added to $A(t)$ to get $A(t + 1)$. 

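In addition to these formulae we have specific formulae for the first 
few periods resulting from the initial conditions. For example, the egg 
number at the end of the first period can be readily seen to be 

$$E(1) = r_E A\!\left(\tfrac{1}{2}\right) S_E^{1/2}\!\left(\tfrac{3}{4}\right). \tag{9}$$

Similarly $S(1) = 0$, $M(1) = M(2) = M(3) = 0$, $L(1) = 0$, etc., and 

$$\begin{aligned}
S(2) &= \tfrac{2}{3}\,r_E A\!\left(\tfrac{1}{3}\right) S_E^{4/3}(1)\,S_S^{1/3}\!\left(\tfrac{11}{6}\right),\\
M(4) &= \tfrac{2}{3}\,r_E A\!\left(\tfrac{1}{3}\right) S_E^{4/3}(1)\,S_S^{2}\!\left(\tfrac{8}{3}\right) S_M^{1/3}\!\left(\tfrac{23}{6}\right),\\
&\;\;\vdots\\
a(10) &= \tfrac{2}{3}\,r_E A\!\left(\tfrac{1}{3}\right) S_E^{4/3}(1)\,S_S^{2}\!\left(\tfrac{8}{3}\right) S_M^{2}\!\left(\tfrac{14}{3}\right) S_L^{2}\!\left(\tfrac{20}{3}\right) S_P^{2}\!\left(\tfrac{26}{3}\right) S_a^{1/3}\!\left(\tfrac{59}{6}\right),\\
S(3) &= \tfrac{5}{3}\,r_E A\!\left(\tfrac{5}{6}\right) S_E^{4/3}\!\left(\tfrac{3}{2}\right) S_S^{5/6}\!\left(\tfrac{31}{12}\right),\\
M(5) &= \tfrac{5}{3}\,r_E A\!\left(\tfrac{5}{6}\right) S_E^{4/3}\!\left(\tfrac{3}{2}\right) S_S^{2}\!\left(\tfrac{19}{6}\right) S_M^{5/6}\!\left(\tfrac{55}{12}\right),\\
&\;\;\vdots\\
a(11) &= \tfrac{5}{3}\,r_E A\!\left(\tfrac{5}{6}\right) S_E^{4/3}\!\left(\tfrac{3}{2}\right) S_S^{2}\!\left(\tfrac{19}{6}\right) S_M^{2}\!\left(\tfrac{31}{6}\right) S_L^{2}\!\left(\tfrac{43}{6}\right) S_P^{2}\!\left(\tfrac{55}{6}\right) S_a^{5/6}\!\left(\tfrac{127}{12}\right),\\
\Delta A(11) &= \tfrac{2}{3}\,r_E A\!\left(\tfrac{1}{3}\right) S_E^{4/3}(1)\,S_S^{2}\!\left(\tfrac{8}{3}\right) S_M^{2}\!\left(\tfrac{14}{3}\right) S_L^{2}\!\left(\tfrac{20}{3}\right) S_P^{2}\!\left(\tfrac{26}{3}\right) S_a^{2}\!\left(\tfrac{32}{3}\right) - d\,A\!\left(\tfrac{23}{2}\right).
\end{aligned} \tag{10}$$

All subsequent values are given by formulae (2)-(8). All prior values 
are zero. 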


Under some conditions it is possible to determine the number of larvae 
that have hatched during a given three-day period. If $H(t)$ denotes this 
quantity we can calculate $H(t)$ from the following expression, derived as 
expressions (1) to (7) above: 

$$H(t) = r_E A\!\left(t - \tfrac{11}{6}\right) S_E^{4/3}\!\left(t - \tfrac{7}{6}\right) S_S^{1/2}\!\left(t - \tfrac{1}{4}\right). \tag{11}$$

Now from (1) above we can write 

$$E^2\!\left(t - \tfrac{5}{6}\right) = \tfrac{16}{9}\,r_E^2 A^2\!\left(t - \tfrac{3}{2}\right) S_E^{4/3}\!\left(t - \tfrac{7}{6}\right). \tag{12}$$


From the ratio of these expressions we find 


i? (t—2) = 16 WO He =, 
HO a ea ar Ab a8) 2 (13) 
Adding 14 to the time, we find, 
2 
yA (t+4) = SY? ae (14) 


Since $S_S^{1/2}$ is numerically nearly one and varies only slightly, it may be 
considered to be a constant as compared with the other quantities which 
vary considerably. Taking $S_S^{1/2} = 0.92$ as an average value from the data 
we may then calculate the egg-laying rate per adult per period from 
F2(t+1) 
Wee) oe) 
where small fractions of a period are ignored. Having thus estimated $r_E(t)$, 
we may now estimate $S_E(t)$ from (1), etc., for other functions. This 
process is facilitated by noticing that the following ratios, which are 
easily obtained from experimental data, are relatively simple functions 
of the $S$'s: 


S(é+1 : 
ney = 8 $48 (t— 2) S, (+4) [1 +2 A log S, ¢—3) +...] , (16) 


ry (t) = 0.5 —_— 


aS = S,(t—1) Sy +4) [14-4 A log S,(¢— 13) +...) 17) 


Ny) = — 5 Pee 18 
Fay 7 Su U- 1S, 4B) FB A log Sy (1 18) +. 1, (18) 


P(i+1 F ae 
pa = S51) S, G+) [1+ 3A log S, G 14)+...], (19) 


ja (é+1) _ a2 Loe Sadia) acolaien(20) 
PTT Sp (t— 1) S, +4) [1444 log S, (¢— 12) 


aa MS u(y = 111 pa log Se 04) +E ev lee(21) 


a (¢—1) 
Prt UY PmWD 5 (¢—1) [1+4$A log S, (6-14) +.--1, (22) 
CESS 
P(t+1) _ (23) 
TaGees L395 (¢+4), 
© Use ES Ae (24) 


ead) 
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In the above L» and P,, are the numbers of large larval and pupal 
molts found during a given period. For most practical purposes the cor- 
rection terms can be ignored, except where the numbers are changing 
very rapidly. From these expressions it is seen that Sp can be calculated 
independently in two ways. There is therefore a check on the consistency 
of the assumptions made. With Sp now calculated, there are two estimates 
of S, so that again internal consistency can be checked. Now S;, can be 
estimated and, similarly, Sj,, though with less certainty. 

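General case. The equations for the calculation of the population numbers 
for a species in which the various development times are $T_E$, $T_S$, 
..., $T_P$, $T_a$ can be obtained in the same manner as in the special case. 
It is again assumed that the effects of the distributions about $T_E$, etc., are 
negligible and that the observations are made often enough to enable 
the parameters to be estimated. Only a few expressions are listed since the 
other expressions are cumbersome to write but easily obtained by generalizing 
those given. A comparison with equations (2) to (10) will make 
this even easier. The equations are as follows: 

$$\begin{aligned}
E(t) &= T_E\,r_E A\!\left(t - \tfrac{1}{2}T_E\right) S_E^{T_E/2}\!\left(t - \tfrac{1}{4}T_E\right),\\
S(t) &= T_S\,r_E A\!\left(t - \tfrac{1}{2}T_S - T_E\right) S_E^{T_E}\!\left(t - \tfrac{1}{2}T_S - \tfrac{1}{2}T_E\right) S_S^{T_S/2}\!\left(t - \tfrac{1}{4}T_S\right),\\
M(t) &= T_M\,r_E A\!\left(t - \tfrac{1}{2}T_M - T_S - T_E\right) S_E^{T_E}\!\left(t - \tfrac{1}{2}T_M - T_S - \tfrac{1}{2}T_E\right) S_S^{T_S}\!\left(t - \tfrac{1}{2}T_M - \tfrac{1}{2}T_S\right) S_M^{T_M/2}\!\left(t - \tfrac{1}{4}T_M\right),\\
&\;\;\vdots\\
\Delta A(t) &= r_E A\!\left(t + \tfrac{1}{2} - C\right) S_E^{T_E}\!\left(t + \tfrac{1}{2} - C + \tfrac{1}{2}T_E\right) S_S^{T_S}\!\left(t + \tfrac{1}{2} - C + \tfrac{1}{2}T_S + T_E\right)\\
&\quad\times S_M^{T_M}\!\left(t + \tfrac{1}{2} - T_a - T_P - T_L - \tfrac{1}{2}T_M\right) S_L^{T_L}\!\left(t + \tfrac{1}{2} - T_a - T_P - \tfrac{1}{2}T_L\right)\\
&\quad\times S_P^{T_P}\!\left(t + \tfrac{1}{2} - T_a - \tfrac{1}{2}T_P\right) S_a^{T_a}\!\left(t + \tfrac{1}{2} - \tfrac{1}{2}T_a\right) - d\,A\!\left(t + \tfrac{1}{2}\right).
\end{aligned} \tag{25}$$

Here $C$ denotes the total development time $T_E + T_S + T_M + T_L + T_P + T_a$. 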
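
The pattern behind these expressions can be sketched in a few lines: for each stage the coefficient, the lag in the adult argument, and the exponent and lag of each survival factor follow from the durations alone. The function name `stage_terms` and the tuple layout are illustrative choices, not part of the paper; the rule coded here is simply the one read off from equations (2) to (7).

```python
from fractions import Fraction as F


def stage_terms(durations, k):
    """Coefficient, adult-argument lag and survival factors for stage k.

    Each survival factor is a pair (exponent, lag), to be read as
    S_stage ** exponent evaluated at time t - lag.
    """
    earlier = sum(durations[:k], F(0))      # time spent in completed stages
    adult_lag = durations[k] / 2 + earlier  # laying time of the "average" individual
    factors = []
    elapsed = F(0)
    for i in range(k):
        entered = adult_lag - elapsed       # lag at which stage i was entered
        factors.append((durations[i], entered - durations[i] / 2))
        elapsed += durations[i]
    # Current stage: occupied for T_k / 2 on average, centred at t - T_k / 4.
    factors.append((durations[k] / 2, durations[k] / 4))
    return durations[k], adult_lag, factors


# Durations in three-day periods for eggs through immature adults.
FLOUR_BEETLE = [F(4, 3), F(2), F(2), F(2), F(2), F(2)]

for name, k in zip("ESMLPa", range(6)):
    coeff, lag, factors = stage_terms(FLOUR_BEETLE, k)
    print(name, coeff, lag, factors)
```

For the flour beetle durations this reproduces, for example, the lag of 13/3 periods in the adult argument of $M(t)$ and the factor $S_S^2(t - 2)$ of equation (4).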


In general, the smaller the unit of time and the larger the number of 
subdivisions the less the error introduced in the calculations. Fairly large 
divisions or fairly long periods may be used if the functions S are averaged 
over periods of time of the order of the corresponding $T$ value. For example, 
in expression (8) the terms $S_S^{2}(t - \tfrac{17}{2}) \ldots$ can be conveniently 
written as $S_S(t - 9)S_S(t - 8) \ldots$. 

The survival functions. Mortality factors from various causes act together 
to determine each of the survival functions $S_E$, $S_S$, .... If each 
type of mortality were independent, then the resulting survival would be 
the product of the individual survival factors, each being the complement 
of the respective mortality fraction. If in addition the specified mortality 
rate due to a single factor, e.g., predation of adults on eggs, is relatively 
independent of the numbers of adults or eggs within a range of values to 
be considered, then the contribution due to this factor would be exponential. 
In general, however, it will be more complex [cf. eq. (28)]. 

In the above formulae we have attempted to use only data from a 
freely growing population to determine the survival functions. It may be 
hoped that, if various stages are removed from a population maintained 
at specified conditions and allowed to interact under conditions as little 
altered as possible, the interaction may represent that which is going on 
in the population. A check of this would be furnished if such independent 
results were found to agree with those calculated from the above equa- 
tions. For example, it is possible in some cases to mark eggs (Rich, 1954). 
If $E^*(0)$ eggs are incubated with $A$ adults, then for the number $E^*(1)$ of 
eggs one period later we obtain directly the value of $S_E$ for this particular 
situation as 
$$S_E = \frac{E^*(1)}{E^*(0)}. \tag{26}$$
The resulting value for $S_E$ would in general depend upon the conditions 
under which the adults had been maintained, whether or not they had 
been prevented from eating eggs, or if they had just been taken from 
steady-state conditions. It would of course depend upon the number $A$ 
of adults present. The degree of independence of individual interactions 
would be measured by the closeness of the relation $S_E(A)$ to a negative 
exponential. Once $S_E$ is calculated $r_E$ can be calculated directly from (9) 
where $E(1)$ is the observed number of unmarked eggs one period after 
incubation of $A$ adults in a medium containing no unmarked eggs. Thus 
from (26) we find 

$$r_E = \frac{E(1)}{A}\sqrt{\frac{E^*(0)}{E^*(1)}}. \tag{27}$$
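
The two estimates can be sketched as follows, assuming that (26) reads $S_E = E^*(1)/E^*(0)$ and that (9) has the form $E(1) = r_E A\,S_E^{1/2}$. The function names and the counts are hypothetical, made up only to show the arithmetic.

```python
import math


def egg_survival(marked_start, marked_end):
    """Per-period egg survival from marked eggs, as in eq. (26)."""
    return marked_end / marked_start


def laying_rate(unmarked_end, adults, marked_start, marked_end):
    """Eggs laid per adult per period, solving E(1) = r_E * A * S_E**0.5 for r_E."""
    s_e = egg_survival(marked_start, marked_end)
    return unmarked_end / (adults * math.sqrt(s_e))


# Hypothetical counts: 100 marked eggs of which 64 remain after one period,
# and 400 unmarked eggs found after incubating 20 adults for one period.
s_e = egg_survival(100, 64)
r_e = laying_rate(400, 20, 100, 64)
print(f"S_E = {s_e:.2f}, r_E = {r_e:.1f}")
```

The square root enters because eggs laid during the period have been exposed, on the average, for only half of it.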

In the above we have assumed that an equal number of males and 
females are used. If not, then instead of $r_E$ we should write $2p\,r_E$ where $p$ 
is the fraction of females. Thus $p$ becomes a new variable and it becomes 
possible in principle to check whether or not the value of $S_E$ for a mixed 
group is greater than, equal to, or less than that due to the combined 
effects of the males and females (Rich, loc. cit.). In a similar way the degree 
to which the egg-laying rate is dependent upon the numbers of males or 
females and other factors can be measured (Rich, loc. cit.; Birch, Park, and 
Frank, 1951). 

If adults are incubated with small larvae one can obtain direct values 
for the contribution of a given number of adults to the value of $S_S$. In 
a similar way it may be possible to obtain independently all the primary 
interactions. These may be considered primary if in each case $S_y$ decreases 
exponentially with the number of adults, i.e., $-\log S_y$ is proportional to 
$A$, the subscript $y$ representing any stage. 

In the case of the larval stages it may not be possible to neglect the 
interaction within a stage. In this case first determine this effect, obtaining 
from the relation 

$$-\log_e S_X = C_{XX}X(1 - \gamma_{XX}X) + \cdots$$

the primary coefficient $C_{XX}$ of the cannibalism of $X$ on $X$ and a secondary 
interaction coefficient $\gamma_{XX}$ from results in which there are $X$ individuals 
of a given stage only. Similarly using only individuals in stage $Y$ one 
obtains the corresponding coefficients $C_{YY}$ and $\gamma_{YY}$. If now the stages are 
incubated together, then from $S_X$ we have 

$$-\log_e S_X = C_{XX}X(1 - \gamma_{XX}X) + C_{YX}Y(1 - \gamma_{YX}Y) - \Gamma_X XY, \tag{28}$$

from which one can calculate the primary coefficient $C_{YX}$ and perhaps also 
estimate the secondary coefficient $\gamma_{YX}$ and the interaction coefficient $\Gamma_X$. 
Note that a positive value of a secondary coefficient corresponds to interference 
with cannibalism. From values of $S_Y$ one can similarly obtain estimates 
of $C_{XY}$, $\gamma_{XY}$, and $\Gamma_Y$, as well as the natural death rate which has 
been ignored. 
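
One way the single-stage coefficients might be extracted is sketched below. If higher-order terms are dropped, $-\log_e S_X / X = C_{XX} - C_{XX}\gamma_{XX}X$ is linear in $X$, so an ordinary least-squares line gives both coefficients. The least-squares step and the function name are assumptions of this sketch, not a procedure stated in the text, and the data are synthetic.

```python
import math


def fit_within_stage(counts, survivals):
    """Estimate (C_XX, gamma_XX) from single-stage survival observations.

    Fits y = intercept + slope * X with y = -log(S_X) / X, so that
    C_XX = intercept and gamma_XX = -slope / intercept.
    """
    ys = [-math.log(s) / x for x, s in zip(counts, survivals)]
    n = len(counts)
    mean_x = sum(counts) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in counts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(counts, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, -slope / intercept


# Synthetic observations generated from known (hypothetical) coefficients.
TRUE_C, TRUE_GAMMA = 0.003, 0.002
counts = [20, 40, 80, 160]
survivals = [math.exp(-TRUE_C * x * (1 - TRUE_GAMMA * x)) for x in counts]

c_xx, gamma_xx = fit_within_stage(counts, survivals)
print(f"C_XX = {c_xx:.5f}, gamma_XX = {gamma_xx:.5f}")
```

With real counts the points would scatter about the line, and the closeness of the fit would itself indicate whether a single secondary coefficient is adequate.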

Special case: no interaction between or within mortality factors. If all the 
factors can be approximated by simple exponentials in the population 
numbers over a sufficiently large range and if $S_1 = S_E^{4/3}$, $S_2 = S_S$, ..., etc., 
and $N_1 = E$, $N_2 = S$, etc., we may write 

$$S_i = \exp\Bigl(-\sum_j C_{ji}N_j\Bigr), \tag{29}$$

where $C_{ji}$ is the mortality of $i$ due to $j$. If the death rates cannot be neglected, 
then a quantity $d_i$ should be added under the summation sign. 
These quantities are themselves determined by the more basic quantities 
such as the kinetic factors determining “collision” frequencies, and quan- 
tities determining the outcome of a given contact. That such a simple 
system can account for the principal features is shown by a comparison 
of Figures 1 and 2. Figure 1 shows graphically a portion of the time varia- 
tion of the population numbers of an experimental curve (Strawbridge, 
loc. cit.), whereas Figure 2 shows calculated values in which $r_E = 25$, 
$C_{74} = C_{75} = C_{76} = 0.005$, $C_{73} = C_{43} = 2C_{33} = 0.003 = C_{72} = C_{42} = C_{32} = 2C_{22}$, 
$C_{71} = C_{41} = C_{31} = 0.03$, $C_{21} = 0.003$, $d = 0.009$, all other coefficients being 
zero. A comparison of Figures 1 and 2 suggests that the principal difference 
lies in the tendency of the amplitude of the periodic components of 
the calculated values to damp out in time. The results shown in Figure 1 
were obtained under conditions in which the flour was changed every 
fifteen days, thereby introducing an impressed force upon the system, 
the period being close to one-half of the time of a life cycle. This could, of 
course, tend to maintain the oscillation as shown by the calculated results 
of Strawbridge (loc. cit.), where this model is used in a somewhat different 
manner. 

FIGURE 1. Data on Tribolium castaneum Herbst, reproduction of Figure 3 of Strawbridge 
(1953). Ordinate is the logarithm of the number of individuals. For further description, 
see legend of Figure 2. 

FIGURE 2. Population numbers as calculated from theory using three-day periods with 
$r_E = 25$, $d = 0.009$, $S_E^{4/3} = \exp\{-0.03(0.1S + M + L + A)\}$, $S_S = \exp\{-0.003(0.5S + M + L + A)\}$, 
$S_M = \exp\{-0.003(0.5M + L + A)\}$, $S_L = S_P = S_a = \exp(-0.005A)$. The ordinate 
in this and subsequent figures is linear with $\log(1 + N)$, $N$ being any population number, $E$ 
denoting number of eggs, $S$, $M$, and $L$ denoting numbers of small, medium, and large larvae, $P$, 
$a$, and $A$ denoting the numbers of pupae, immature adults, and mature adults, $a + A$ being 
the total number of adults or imagoes. (Ordinate: number of individuals; abscissa: time in days.) 
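
The survival functions used for the calculated curves can be written out directly. The sketch below uses the coefficients as they are read here from the legend of Figure 2; that reading of a poorly reproduced legend, the function name, and the dictionary keys are assumptions of this illustration.

```python
import math


def survival_factors(S, M, L, A):
    """Per-stage survival factors with the coefficients read from the Figure 2 legend.

    S, M, L are the numbers of small, medium and large larvae; A is the
    number of mature adults. Eggs are reported as S_E**(4/3), the factor
    that enters the stage equations.
    """
    adult_only = math.exp(-0.005 * A)
    return {
        "S_E^(4/3)": math.exp(-0.03 * (0.1 * S + M + L + A)),
        "S_S": math.exp(-0.003 * (0.5 * S + M + L + A)),
        "S_M": math.exp(-0.003 * (0.5 * M + L + A)),
        "S_L": adult_only,
        "S_P": adult_only,
        "S_a": adult_only,
    }


# Hypothetical population: 40 adults and no larvae.
factors = survival_factors(S=0, M=0, L=0, A=40)
for stage, value in factors.items():
    print(f"{stage}: {value:.3f}")
```

Each factor is a simple exponential in the numbers of the stages preying on it, which is the form (29) with all secondary coefficients set to zero.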
FIGURE 3. The effect of the initial number of adult pairs on the population numbers as 
calculated from the theory using the same parameters as in Figure 2. 


It would be of interest to study theoretically the effect of the various 
coefficients on the behavior of the system. Because of the complexity of 
the system it is difficult to make any general statements. For this reason 
a few calculations have been carried out to show the effect of the change 
in the values of some of the parameters. The effect of introducing one, 
three, ten, or thirty adult pairs is shown in Figure 3, the parameters other- 
wise being identical to those used for Figure 2. The results are shown for 
the total numbers of all adults (A + a), eggs, and large larvae. From the 
graphs it is seen that by twelve periods the number of adults is relatively 
independent of the initial number and, similarly, the numbers of eggs and 
larvae are not too greatly influenced by the initial number of adults $A_0$ 
except that in the early periods the egg numbers increase with $A_0$ and 
the logarithms of the amplitudes are least for the case of $A_0 = 60$ (cf. Park, 
1948). 

TEMPORAL PATTERN OF A POPULATION STRUCTURE 

Using the mean values of the numbers of the various stages from Figure 
2, for the initial conditions, it being assumed that these numbers have 
been maintained sufficiently long, calculations were made to show that 
the population was stable (cf. early part of Figure 4). At the nineteenth 
period twenty adults were “removed,” being replaced two periods later. 
The results in the figure show that the disturbance of the balance in the 
population has a rather pronounced effect on the population numbers but 
that the effect is almost gone by the fiftieth period. 


FIGURE 4. The effect of an abrupt temporary change in the adult population number upon 
the subsequent population numbers. (Ordinate: number of individuals; abscissa: time in days.) 


The effect of the relative rates of the egg-laying rate and the egg 
cannibalism on the stability is illustrated by the results in Figure 5. Here 
the egg-laying rate is cut in half and the egg cannibalism increased about 
three times so that the resulting adult number is about the same. It is seen 
that in this case the system is considerably more stable and that the other 
stages are, on the average, more numerous. Thus the situation in which 
cannibalism is greater results in a greater tendency toward oscillatory 
behavior. 

We next consider a case in which the tendency to oscillatory behavior is 
greatly increased. One would expect this to be so particularly if the matrix 
of the interaction coefficients is non-zero only below the diagonal and if 
the rather uniform pressure of the adults is removed. Such a case is illustrated 
in Figure 6. For the first two life cycles the pattern is not too different 
from that of Figure 2. However, by the third cycle an increasing amplitude 
becomes evident and by the thirtieth census period the oscillations 
are so great that the method of calculation breaks down because too rapid 
changes are occurring during a single unit of time. Thus the last few 
points no longer even approximately represent a realistic model. The cal- 
culations were carried out only to enable the graphing of the total adult 
numbers. These can be seen to be relatively unaffected by the great in- 
ternal changes. 

In this case it is possible to calculate steady state values for the population 
numbers. If these values are used as initial conditions the system remains 
unchanged. But if, for example, the adult number is changed for a 
short interval we find that the oscillation which follows builds up to such an 
extent that it no longer can be followed. This can be seen in the last part 
of the figure after the adult number has been artificially reduced to 80% 
for two census periods. These results suggest that if an experiment were 
carried out in which eggs were introduced into a container at a constant 
rate and pupae were removed at each census, then the amplitude of the 
oscillation of the numbers of the different stages should be much greater 
than that occurring in the presence of the adults. 

FIGURE 5. An example to show the change in the temporal pattern of the population structure 
as a result of reduced egg-laying rate together with increased egg cannibalism, the combination 
leaving the final adult numbers relatively unchanged: $r_E = 12.5$, $S_E^{4/3} = \exp\{-0.01(0.1S + M + L + A)\}$, 
other parameters the same as in Figure 2. 

In this special case we have not only neglected secondary interactions 
but also the effect of egg-eating by the adults on the egg-laying rate (Stanley, 
loc. cit.). From Figure 1 it can be seen that the number of eggs does not 
vary nearly as much as do the other immature stages so that except for 
the first life cycle and especially from the first to the fifth census periods 
the ratio between the number of eggs and adults is relatively constant and 
probably not much different from that obtaining in the population from 
which they were taken. As a result, unless the egg-laying rate is fairly 
sensitive to the eating of eggs, one would not expect the effect to ap- 
preciably alter the principal trends. 

If we suppose that any interaction coefficient $c_{ij}$ is the product of a term 
representing the probability that stages $i$ and $j$ come within a prescribed 
distance $\delta_{ij}$ of each other per unit time and a term representing the probability 
$p_{ij}$ that if these stages come together $i$ will destroy $j$, then the coefficient 
$c_{ij}$ can be written as 

Dig Vist Bi 

Deanery Tyree (30) 
where $v_{ij}$ is the relative velocity for the pair $i$, $j$ and $V$ is the volume of the 
system. If now both sides of the expressions from (1) to (10) are divided 
by $V$, then because of (29) and (30) the expressions involve only $N_i/V$. If 
we had defined the $N_i$ as the number per unit volume, the volume $V$ 
would no longer appear. Hence to the extent that (29) and (30) hold the 
volume should not play a role if one uses only the number densities (cf. 
Park, loc. cit.). Implicitly assumed in (30) is that each linear dimension is 
large compared to $\delta_{ij}$ so that the results would not hold for a thin, wide 
region, for example. Furthermore, the volume disappears only if the 
quantities $c_{ij}$ involve secondary effects to a sufficiently small extent, that 
is, the quantities $\gamma$ and $\Gamma$ in (28) must be small. The fact that the volume 
plays a relatively small role suggests that these secondary effects can be 
neglected. 
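
The volume argument can be illustrated numerically. The sketch assumes only what the paragraph states, namely that each coefficient varies inversely with the volume, so that it can be written as a volume-independent factor divided by $V$. The factors `k_row`, the numbers, and the function name are hypothetical.

```python
import math


def survival(k_row, numbers, volume):
    """Survival exp(-sum_j c_j N_j) with c_j = k_j / volume.

    k_row holds hypothetical volume-independent factors (contact distance,
    relative velocity and destruction probability lumped together).
    """
    return math.exp(-sum(k / volume * n for k, n in zip(k_row, numbers)))


k_row = [0.12, 0.3, 0.3]   # hypothetical factors for three predator stages
numbers = [50, 20, 40]     # individuals of each predator stage
volume = 8.0               # arbitrary units

same_density = survival(k_row, [2 * n for n in numbers], 2 * volume)
print(survival(k_row, numbers, volume), same_density)
```

Doubling both the numbers and the volume leaves the survival unchanged, whereas doubling the numbers alone would not; this is the sense in which only the number densities matter.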

It has been shown that the equations of the model satisfactorily repre- 
sent the principal feature of the population dynamics of the flour beetle. 
They enable one to calculate a number of the parameters under natural 
conditions. However, the equations are not of a kind which makes it pos- 
sible to obtain properties of the system easily. In spite of this it is possible 
to discuss some aspects of the problem of competition between species in 
terms of the model. A discussion of this problem will be reserved for a 
second paper. 


The author wishes to express appreciation to Professor Thomas Park 
for comments and suggestions. The author is indebted to Dr. Dennis 
Strawbridge for permission to reproduce Figure 1. 
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ERRATUM 


To the paper by N. Rashevsky entitled "Topology and Life: In Search 
of General Mathematical Principles in Biology and Sociology” (Vol. 16, 
317-48). 

On page 334 of the above paper, third line of paragraph T;, the word 
"non-residual" should be deleted. 
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